Abstract

We consider a three-terminal state-dependent relay channel with the channel state available
non-causally at only the source. Such a model may be of interest for node cooperation in the
framework of cognition, i.e., collaborative signal transmission involving cognitive and
non-cognitive radios. We study the capacity of this communication model. One principal problem
is caused by the relay's not knowing the channel state. For the discrete memoryless (DM)
model, we establish two lower bounds and an upper bound on channel capacity. The first
lower bound is obtained by a coding scheme in which the source describes the state of the
channel to the relay and destination, which then exploit the gained description for a better
communication of the source's information message. The coding scheme for the second lower
bound remedies the relay's not knowing the states of the channel by first computing, at the
source, the appropriate input that the relay would send had the relay known the states of the
channel, and then transmitting this appropriate input to the relay. The relay simply guesses the
sent input and sends it in the next block. The upper bound is non-trivial and it accounts for
not knowing the state at the relay and destination. For the general Gaussian model, we derive
lower bounds on the channel capacity by exploiting ideas in the spirit of those we use for the
DM model; and we show that these bounds are optimal for small and large noise at the relay
irrespective of the strength of the interference. Furthermore, we also consider a special case
model in which the source input has two components, one of which is independent of the state.
We establish a better upper bound for both DM and Gaussian cases and we also characterize
the capacity in a number of special cases.



Index Terms 



User cooperation, relay channel, cognitive radio, channel state information, dirty paper 
coding. 

I. Introduction 

We consider a three-terminal state-dependent relay channel (RC) in which, as shown in Figure 1, the source
wants to communicate a message $W$ to the destination through the state-dependent RC in $n$ uses of the channel,
with the help of the relay. The channel outputs $Y_2^n$ and $Y_3^n$ for the relay and the destination, respectively, are
controlled by the channel input $X_1^n$ from the source, the relay input $X_2^n$ and the channel state $S^n$, through a given
memoryless probability law $W_{Y_2,Y_3|X_1,X_2,S}$. The channel state $S^n$ is generated according to the $n$-product of a given
memoryless probability law $Q_S$. It is assumed that the channel state is known, non-causally, to only the source. The
destination estimates the message sent by the source from the received channel output. In this paper we study the
capacity of this communication system. We will refer to the model in Figure 1 as the general state-dependent RC with
informed source.

Fig. 1. General state-dependent relay channel with state information $S^n$ available non-causally at only the source.

We shall also study an important special case of the general model, shown in Figure 2. In this special model, the
source alphabet is $\mathcal{X}_1 = \mathcal{X}_{1R} \times \mathcal{X}_{1D}$, the source input is $X_1^n = (X_{1R}^n, X_{1D}^n)$, and only the component $X_{1D}^n$ knows the states $S^n$. Furthermore,
the memoryless conditional law $W_{Y_2,Y_3|X_{1R},X_{1D},X_2,S}$ factorizes as

$$W_{Y_2,Y_3|X_{1R},X_{1D},X_2,S} = W_{Y_2|X_{1R},S}\, W_{Y_3|X_{1D},X_2,S}. \tag{1}$$

One can think of the two source encoder components in Figure 2 as being two non-colocated base stations
transmitting a common message to some destination with the help of a relay. The common message may be
obtained by means of message cognition at the encoder whose input is heard at the relay.



January 19, 2013 



DRAFT 




Fig. 2. State-dependent relay channel with the source input $X_1^n = (X_{1R}^n, X_{1D}^n)$, and only the component $X_{1D}^n$ knowing
the states of the channel non-causally.



A. Background and Related Work 

Channels whose probabilistic input-output relation depends on random parameters, or channel states, have
spurred much interest and can model a large variety of problems, each related to some physical situation of
interest. The random state sequence may be known in a causal or non-causal manner. For single-user models, the
concept of channel state available at only the transmitter dates back to Shannon [1] for the causal channel state case,
and to Gel'fand and Pinsker [2] for the non-causal channel state case. In [3], Heegard and El Gamal study a model
in which the state sequence is known non-causally to only the encoder or to only the decoder. They also derive
achievable rates for the case in which partial channel state information (CSI) is given at varying rates to both the
encoder and the decoder. In [4], Costa studies an additive Gaussian channel with additive Gaussian state known
at only the encoder, and shows that Gel'fand-Pinsker coding with a specific auxiliary random variable, known as
dirty paper coding (DPC), achieves the channel capacity. Interestingly, in this case, DPC removes the effect of the
additive channel state on the capacity as if there were no channel state present in the model or the channel state
were known to the decoder as well. For a comprehensive review of state-dependent channels and related work,
the reader may refer to [5].
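To make Costa's observation concrete, the following minimal sketch compares the DPC rate with the rate obtained when the state is simply treated as extra noise. It assumes a real-valued channel $Y = X + S + Z$ with input power `p`, state variance `q` and noise variance `n`; the function names and the numerical values are illustrative and are not taken from the paper.

```python
import math

def awgn_capacity(p, n):
    """Capacity (bits per channel use) of a real AWGN channel with power p and noise variance n."""
    return 0.5 * math.log2(1.0 + p / n)

def rate_state_as_noise(p, q, n):
    """Rate when the Gaussian state of variance q is treated as additional noise."""
    return 0.5 * math.log2(1.0 + p / (q + n))

def costa_alpha(p, n):
    """Scaling used in Costa's auxiliary variable U = X + alpha * S."""
    return p / (p + n)

# Illustrative (assumed) values: a strong state compared to the input power.
P, Q, N = 10.0, 100.0, 1.0
dpc_rate = awgn_capacity(P, N)            # with DPC the rate does not depend on Q
naive_rate = rate_state_as_noise(P, Q, N)
```

With these assumed values the DPC rate equals the interference-free capacity, whereas the naive rate shrinks as `Q` grows.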

A growing body of work studies multi-user state-dependent models. Recent advances in this regard can be
found in [5]-[26], and many other works. Key to the investigation of a state-dependent model is whether the
parameters controlling the channel are known to all or only some of the users in the communication model. If the
parameters of the channel are known to only some of the users, the problem exhibits some asymmetry which makes
its investigation more difficult in general. Also, in this case one has to expect some rate penalty due to the lack of
knowledge of the state at the uninformed encoders, relative to the case in which all encoders would be informed.

The state-dependent multiaccess channel (MAC) with only one informed encoder and degraded message sets is
considered in [6], [7], [27]-[30]; and the state-dependent relay channel (RC) with only the relay informed is considered
in [11], [12]. For all these models, the authors develop non-trivial outer or upper bounds that make it possible to characterize
the rate loss due to not knowing the state at the uninformed encoders. A key feature in the development of these
outer or upper bounding techniques is that, in all these models, the uninformed encoder not only does not know
the channel state but can learn no information about it.

The model for the RC with informed source that we study in this paper seemingly exhibits some similarities with
the RC with informed relay considered in [11], [12], and it also connects with the MAC with asymmetric channel
state and degraded message sets considered in [6]-[8]. However, establishing a non-trivial upper bound for the
present model is comparatively more involved. Partly, this is because, here, one uninformed encoder (the relay) is
also a receiver; and, so, it can potentially get some information about the channel states from directly observing the
past received sequence from the source. That is, at time $i$, the input $X_{2,i}$ of the relay can potentially depend on the
channel states through its past output $Y_2^{i-1} = (Y_{2,1}, \ldots, Y_{2,i-1})$. For the general model in Figure 1, the relay can even
know the states non-causally, potentially. This is because $X_{2,i}$ may depend on future values of the state through
past source inputs $X_{1,j}(W, S^n)$, $j = 1, \ldots, i-1$. For the special case in Figure 2, the relay can know the states only
strictly causally, but upper bounding the capacity still does not seem easy. In our recent work [31], [32], we have shown
that, in a multiaccess channel, strictly causal knowledge of the state at one encoder can be beneficial in general
for the other encoder even if the latter is informed non-causally; but the capacity region is still to be characterized in
general. Studying networks in which a subset of the nodes know the states non-causally and another subset know
these states only strictly causally, i.e., networks with mixed (non-causal and strictly causal) states, appears to be
more challenging in general, and is likely to attract additional interest, especially after recent results on the utility
of strictly causally known states in multiaccess channels [16], [17].

B. Main Contributions 

For the general state-dependent RC with informed source shown in Figure 1, we derive two lower bounds and
an upper bound on the channel capacity. In the discrete memoryless (DM) case, the first lower bound is obtained
by a block Markov coding scheme in which the source describes the channel state to the relay and destination ahead
of time. The source sends a two-layer description of the state consisting of two (possibly correlated) individual
descriptions intended to be recovered at the relay and destination, respectively. The relay recovers the individual
description intended for it and then utilizes the estimated state as non-causal state information at the transmitter
to implement collaborative source-relay binning in subsequent blocks, through a combination of decode-and-forward
[33, Theorem 5] and Gel'fand-Pinsker binning [2]. The destination guesses the source's message sent cooperatively
by the source and relay, and the individual description intended for it, from its output and the previously
recovered state. The rationale for the coding scheme which we use for the first lower bound is that, had the relay
known the state with negligible distortion, efficient cooperative source-relay binning in the spirit of [34] could
be realized (recall that the model in [34] assumes availability of the state at both source and relay).

We obtain the second lower bound by a block Markov coding scheme in which, rather than the channel state
itself, the source describes to the relay the appropriate input that the relay would send had the relay known the
channel states, assuming a decode-and-forward relaying strategy. The source sends this description to the relay
ahead of time. The relay recovers the sent input and retransmits it in the appropriate subsequent block. The
rationale for the coding scheme which we use for the second lower bound is that, if the input is produced at the
source using binning against the known state and if the relay recovers it with negligible error, then all would
appear as if the relay were informed of the channel state. This is because, from an operational point of view, the
relay actually need not know the channel state, but, rather, the appropriate input that it would send had it known
this state.

For the state-dependent general model, we also establish an upper bound on the capacity. This upper bound
is non-trivial and it accounts for not knowing the state at the relay and the destination. Then, considering the
special model of Figure 2, we derive a better upper bound that accounts also for the loss incurred by not knowing
the state at one of the source encoder components. We show that this upper bound is strictly tighter than the
max-flow min-cut or cut-set upper bound obtained by assuming that the state is available at all nodes. We note
that upper-bounding techniques for related models with asymmetric channel states, i.e., models with states known
only at some of the encoders, have been developed recently in our previous work [12] for a relay channel with
states known only at the relay, and in [6]-[8] for a MAC with degraded message sets and states known only
at one encoder. However, as we mentioned previously, the model that we study in this paper is comparatively more
involved, essentially because, as a receiver, the relay can get information about the unknown state. From this
angle, our upper bounding techniques here are more closely linked to our recent works [31], [32].

Next, we also consider a memoryless Gaussian model in which the noise and the state are additive and Gaussian. 
The state represents an external interference and is known noncausally to only the source. We derive lower bounds 
on the capacity of the general Gaussian RC with informed source by applying the concepts that we develop for 
the DM case. Similar to the discrete case, one lower bound is based on the idea of describing the state to the relay 
beforehand; the relay recovers it and then utilizes it for collaborative binning in subsequent blocks. The other lower 
bound consists in transmitting to the relay a quantized version of the appropriate input that the relay would send 
had the relay known the channel state. We show that these lower bounds perform well in general and are optimal
for large and small noise at the relay, respectively, irrespective of the strength of the interference.

Furthermore, considering a Gaussian version of the special case model shown in Figure 2, we also develop a
non-trivial upper bound on the capacity that is strictly better than the max-flow min-cut or cut-set upper bound.
We point out the rate loss in the upper bound incurred by the availability of the channel state at only one source
encoder component. Using this upper bound, we characterize the channel capacity in a number of cases, including
when the interference corrupts transmission to the destination but not to the relay.

C. Outline and Notation 

An outline of the remainder of this paper is as follows. Section II describes in more detail the communication
models that we consider in this work. Section III provides lower and upper bounds on the capacity of the discrete
memoryless model. Section IV provides lower and upper bounds on the capacity of the Gaussian model, and
characterizes the channel capacity in some cases. Section V contains some numerical results and discussions.
Finally, Section VI concludes the paper.

We use the following notations throughout the paper. Upper case letters are used to denote random variables,
e.g., $X$; lower case letters are used to denote realizations of random variables, e.g., $x$; and calligraphic letters
designate alphabets, e.g., $\mathcal{X}$. The probability distribution of a random variable $X$ is denoted by $P_X(x)$. Sometimes,
for convenience, we write it as $P_X$. We use the notation $\mathbb{E}_X[\cdot]$ to denote the expectation with respect to the random variable $X$. A
probability distribution of a random variable $Y$ given $X$ is denoted by $P_{Y|X}$. The set of probability distributions
defined on an alphabet $\mathcal{X}$ is denoted by $\mathcal{P}(\mathcal{X})$. The cardinality of a set $\mathcal{X}$ is denoted by $|\mathcal{X}|$. For convenience, the
length-$n$ vector $x^n$ will occasionally be denoted in boldface notation $\mathbf{x}$. The Gaussian distribution with mean $\mu$
and variance $\sigma^2$ is denoted by $\mathcal{N}(\mu, \sigma^2)$. Finally, throughout the paper, logarithms are taken to base 2, and the
complement to unity of a scalar $u \in [0, 1]$ is denoted by $\bar{u}$, i.e., $\bar{u} = 1 - u$.

II. System Model and Definitions 

In this section, we formally present our communication model and the related definitions. As shown in Figure
1, we consider a state-dependent relay channel denoted by $W_{Y_2,Y_3|X_1,X_2,S}$ whose outputs $Y_2^n \in \mathcal{Y}_2^n$ and $Y_3^n \in \mathcal{Y}_3^n$ for
the relay and the destination, respectively, are controlled by the channel inputs $X_1^n \in \mathcal{X}_1^n$ from the source and
$X_2^n \in \mathcal{X}_2^n$ from the relay, along with random states $S^n \in \mathcal{S}^n$. It is assumed that the channel state $S_i$ at time instant $i$ is
independently drawn from a given distribution $Q_S$ and the channel states $S^n$ are non-causally known only at the
source.

The source wants to transmit a message $W$ to the destination with the help of the relay, in $n$ channel uses. The
message $W$ is assumed to be uniformly distributed over the set $\mathcal{W} = \{1, \ldots, M\}$. The information rate $R$ is defined
as $n^{-1}\log M$ bits per transmission.

An $(M, n)$ code for the state-dependent relay channel with informed source consists of an encoding function at the
source

$$\phi_1^n : \{1, \ldots, M\} \times \mathcal{S}^n \to \mathcal{X}_1^n,$$

a sequence of encoding functions at the relay

$$\phi_{2,i} : \mathcal{Y}_2^{i-1} \to \mathcal{X}_2,$$

for $i = 1, 2, \ldots, n$, and a decoding function at the destination

$$\psi^n : \mathcal{Y}_3^n \to \{1, \ldots, M\}.$$

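Let an $(M, n)$ code be given. The sequences $X_1^n$ and $X_2^n$ from the source and the relay, respectively, are transmitted
across a state-dependent relay channel modeled as a memoryless conditional probability distribution $W_{Y_2,Y_3|X_1,X_2,S}$.
The joint probability mass function on $\mathcal{W} \times \mathcal{S}^n \times \mathcal{X}_1^n \times \mathcal{X}_2^n \times \mathcal{Y}_2^n \times \mathcal{Y}_3^n$ is given by

$$P(w, s^n, x_1^n, x_2^n, y_2^n, y_3^n) = \prod_{i=1}^{n} Q_S(s_i)\, P(x_{1,i}|w, s^n)\, P(x_{2,i}|y_2^{i-1})\, W_{Y_2,Y_3|X_1,X_2,S}(y_{2,i}, y_{3,i}|x_{1,i}, x_{2,i}, s_i). \tag{2}$$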

The destination estimates the message sent by the source from the channel output $Y_3^n$. The average probability
of error is defined as $P_e^n = \mathbb{E}_S\big[\Pr\big(\psi^n(Y_3^n) \neq W \,\big|\, S^n = s^n\big)\big]$.

An $(\epsilon, n, R)$ code for the state-dependent RC with informed source is a $(2^{nR}, n)$ code $(\phi_1^n, \phi_2, \psi^n)$ having average
probability of error $P_e^n$ not exceeding $\epsilon$.

A rate $R$ is said to be achievable if there exists a sequence of $(\epsilon_n, n, R)$ codes with $\lim_{n \to \infty} \epsilon_n = 0$. The capacity $C$
of the state-dependent RC with informed source is defined as the supremum of the set of achievable rates.

We shall also study the special case model shown in Figure 2, in which the source alphabet is $\mathcal{X}_1 = \mathcal{X}_{1R} \times \mathcal{X}_{1D}$ and
$X_1^n = (X_{1R}^n, X_{1D}^n)$, with the input component $X_{1R}^n$ a function of only the message $W$ and the input component $X_{1D}^n$
a function of $(W, S^n)$, i.e., $X_{1R}^n = \phi_{1R}^n(W)$ and $X_{1D}^n = \phi_{1D}^n(W, S^n)$, where $\phi_{1R}^n$ and $\phi_{1D}^n$ are the source encoding functions, and
the conditional distribution $W_{Y_2,Y_3|X_{1R},X_{1D},X_2,S}$ factorizes as in (1).

III. The Discrete Memoryless RC with Informed Source 
In this section, we assume that the alphabets $\mathcal{S}$, $\mathcal{X}_1$, $\mathcal{X}_2$, $\mathcal{Y}_2$, $\mathcal{Y}_3$ in the model are all discrete and finite.

A. Lower Bound on Channel Capacity: State Description 

The following theorem provides a lower bound on the capacity of the state-dependent general discrete memoryless
RC with informed source.

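Theorem 1: The capacity of the discrete memoryless state-dependent relay channel with informed source is
lower bounded by

$$\max \min \big\{ I(U; Y_2|V, S_R) - I(U; S, S_D|V, S_R),\;\; I(U, V; Y_3|S_D) - I(U, V; S, S_R|S_D) \big\} \tag{3}$$

subject to the constraints

$$I(S; S_R) < I(U_R; Y_2, S_R|U, V) - I(U_R; S, S_R, S_D|U, V) \tag{4a}$$

$$I(S; S_D) < I(U_D; Y_3, S_D|U, V) - I(U_D; S, S_R, S_D|U, V) + \big[I(U; Y_3, S_D|V) - I(U; S, S_R, S_D|V)\big]^- \tag{4b}$$

$$I(S; S_R, S_D) + I(S_R; S_D) < I(U_R; Y_2, S_R|U, V) - I(U_R; S, S_R, S_D|U, V) + I(U_D; Y_3, S_D|U, V) - I(U_D; S, S_R, S_D|U, V) + \big[I(U; Y_3, S_D|V) - I(U; S, S_R, S_D|V)\big]^- - I(U_R; U_D|U, V, S, S_R, S_D) \tag{4c}$$

where $[x]^- = \min(x, 0)$, and the maximization is over all joint measures on $\mathcal{S} \times \mathcal{S}_R \times \mathcal{S}_D \times \mathcal{U}_R \times \mathcal{U}_D \times \mathcal{U} \times \mathcal{V} \times \mathcal{X}_1 \times \mathcal{X}_2 \times \mathcal{Y}_2 \times \mathcal{Y}_3$
of the form

$$P_{S,S_R,S_D,U_R,U_D,U,V,X_1,X_2,Y_2,Y_3} = Q_S\, P_{S_R,S_D|S}\, P_{V|S_R}\, P_{U|V,S,S_R,S_D}\, P_{U_R,U_D|V,U,S,S_R,S_D}\, P_{X_1|U_R,U_D,U,V,S,S_R,S_D}\, P_{X_2|V,S_R}\, W_{Y_2,Y_3|X_1,X_2,S} \tag{5}$$

and satisfying

$$I(V; Y_3, S_D) - I(V; S_R) > 0. \tag{6}$$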

Proof: An outline of the proof of Theorem 1 will follow, and a complete error analysis appears in Appendix A.

The following remarks are useful for a better understanding of the coding scheme which we use to achieve the
lower bound in Theorem 1.

Remark 1: The intuition for the coding scheme which we use to establish the lower bound in Theorem 1 is as
follows. Had the relay known the state, the source and the relay could implement collaborative binning against
that state for transmission to the destination [34]. Since the source knows the state of the channel non-causally, it
can transmit a description of it to the relay ahead of time. The relay recovers the state (with a certain distortion),
and then utilizes it in the relevant subsequent block through a collaborative binning scheme. The hope is that the
benefit that the source can get from being assisted by a more capable relay will compensate for the loss caused by the
source's spending some of its resources to make the relay learn the state.

In general, it may also turn out to be useful to send a dedicated description of the state to the destination. The destination
utilizes the recovered state as side information at the receiver. In the coding scheme that we employ to establish
the lower bound in Theorem 1, in addition to its message, the source also sends a two-layer description of the state
to the relay and destination, with one layer dedicated to each. The two layers are possibly correlated. The
relay guesses the source's message and the individual state description dedicated to it from the source
transmission and the previously recovered state description. It then utilizes the new state estimate as non-causal
state at the encoder for collaborative source-relay binning over the next block, through a combination of
decode-and-forward and Gel'fand-Pinsker binning. The destination guesses the source's message sent cooperatively by the
source and relay, and the individual state description dedicated to it, from its output and the previously
recovered state description.

Remark 2: As can be seen from the proof in Appendix A, the source sends the descriptions intended for the
relay and destination two blocks ahead of time. That is, at the beginning of block $i$ the source describes the state
vector $\mathbf{s}[i+2]$ to the relay and destination. While a delay of one block is sufficient to describe the state to the relay, a
minimum of two blocks is necessary for the state reconstruction at the destination because of the window
decoding technique that is used. In the following remark, we comment on the relevance of sliding-window decoding
at the destination for our model.

Remark 3: The coding scheme that we employ to prove the lower bound in Theorem 1 uses regular encoding
and sliding-window decoding as a relaying strategy. Backward decoding at the destination, which has been proved
to sometimes offer rates higher than those of window decoding for certain non-classic relaying models [12], is
also possible for our model. However, here, this would require sending independent descriptions to the relay and
destination. More specifically, with backward decoding, the individual description intended for the destination
should correspond to the state sequence that affects block $i-1$, or an earlier block. That is, in this case the source
would have to describe a "future" state vector to the relay and a "past" state vector to the destination. When
the state sequence is i.i.d. across blocks, the two individual descriptions are independent, and, intuitively, this
independence will cause the source to dedicate more of its rate to transmitting the state (in comparison with
sliding-window decoding), thus leaving a smaller rate for the transmission of the information message.

Outline of Proof of Theorem 1

A formal proof of Theorem 1 with complete error analysis is given in Appendix A. We now give a description
of a random coding scheme which we use to obtain the lower bound given in Theorem 1. This scheme is based on
an appropriate combination of block Markov encoding [33], Gel'fand-Pinsker binning [2], multiple descriptions
[35] and Marton's coding for general broadcast channels [36]-[38]. Next, we outline the encoding and decoding
procedures.

We transmit in $B+1$ blocks, each of length $n$. Let $\mathbf{s}[i]$ denote the state sequence controlling the channel in block $i$,
with $i = 1, \ldots, B+1$. During each of the first $B$ blocks, the source encodes a message $w_i \in [1, 2^{nR}]$ and sends it over
the channel. In addition, during each of the first $B-1$ blocks, the source also sends two individual descriptions of
$\mathbf{s}[i+2]$ intended to be recovered at the relay and destination, respectively. We denote by $\mathbf{s}_R[\iota_{Ri}]$, $\iota_{Ri} \in [1, 2^{n\hat{R}_R}]$, the
description of $\mathbf{s}[i+2]$ intended to be recovered at the relay in block $i$, at rate $\hat{R}_R$, and by $\mathbf{s}_D[\iota_{Di}]$, $\iota_{Di} \in [1, 2^{n\hat{R}_D}]$, the
description of $\mathbf{s}[i+2]$ intended to be recovered at the destination in block $i$, at rate $\hat{R}_D$. For the last two blocks, for
convenience, we set $w_{B+1} = 1$, $(\iota_{RB}, \iota_{DB}) = (1, 1)$ and $(\iota_{RB+1}, \iota_{DB+1}) = (1, 1)$. For fixed $n$, the average (channel coding)
rate $R\,B/(B+1)$ of the information message over $B+1$ blocks approaches $R$ as $B \to +\infty$, and the average (source
coding) rates $\hat{R}_R (B-1)/(B+1)$ and $\hat{R}_D (B-1)/(B+1)$ approach $\hat{R}_R$ and $\hat{R}_D$, respectively, as $B \to +\infty$.
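The rate loss caused by the finite number of blocks can be tabulated directly from the fractions above. The following is a minimal sketch; the function name and the numerical rates are illustrative assumptions and do not come from the paper.

```python
def effective_rates(msg_rate, desc_rate_relay, desc_rate_dest, num_blocks):
    """Average rates over B + 1 blocks when messages are sent in the first B
    blocks and state descriptions in the first B - 1 blocks."""
    b = num_blocks
    if b < 1:
        raise ValueError("the number of blocks B must be at least 1")
    msg = msg_rate * b / (b + 1)
    relay = desc_rate_relay * (b - 1) / (b + 1)
    dest = desc_rate_dest * (b - 1) / (b + 1)
    return msg, relay, dest

# Assumed example rates (bits per channel use) for B = 9 and B = 999.
short = effective_rates(1.0, 0.5, 0.25, 9)
long_run = effective_rates(1.0, 0.5, 0.25, 999)
```

As `num_blocks` grows, all three averages approach the per-block rates, which is the limit invoked in the text.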

Codebook generation: Fix a measure $P_{S,S_R,S_D,U_R,U_D,U,V,X_1,X_2,Y_2,Y_3}$ of the form (5). Calculate the marginals $P_{S_R}$ and
$P_{S_D}$ induced by this measure. Fix $\epsilon > 0$, and let

$$M = 2^{n[R-\epsilon]}, \quad J_V = 2^{n[I(V;S_R)+\epsilon]}, \quad M_R = 2^{n[R_R-\epsilon]}, \quad J_R = 2^{n[I(U_R;S,S_R,S_D|U,V)+\epsilon]} \tag{7}$$

with

$$R_R = I(U_R; Y_2, S_R|U, V) - I(U_R; S, S_R, S_D|U, V) - \epsilon$$

$$R_D = I(U_D; Y_3, S_D|U, V) - I(U_D; S, S_R, S_D|U, V) + \big[I(U; Y_3, S_D|V) - I(U; S, S_R, S_D|V)\big]^- - \epsilon \tag{8}$$

where $[x]^-$ denotes $\min(x, 0)$.

We may assume that the first term of the minimization in (3) is non-negative, i.e., $I(U; Y_2, S_R|V) - I(U; S, S_R, S_D|V) \geq 0$.
We generate two statistically independent codebooks (codebooks 1 and 2) by following the steps outlined below
twice. We shall use these codebooks for blocks with odd and even indices, respectively.

1) Generate $2^{n\hat{R}_R}$ $n$-vectors $\mathbf{s}_R[1], \ldots, \mathbf{s}_R[2^{n\hat{R}_R}]$ independently according to a uniform distribution over the set
$T_\epsilon^n(P_{S_R})$ of $\epsilon$-typical $S_R$ $n$-vectors.

2) Generate $2^{n\hat{R}_D}$ $n$-vectors $\mathbf{s}_D[1], \ldots, \mathbf{s}_D[2^{n\hat{R}_D}]$ independently according to a uniform distribution over the set
$T_\epsilon^n(P_{S_D})$ of $\epsilon$-typical $S_D$ $n$-vectors.

3) Generate $J_V M$ independent and identically distributed (i.i.d.) codewords $\{\mathbf{v}(w', j_V)\}$ indexed by $w' = 1, \ldots, M$,
$j_V = 1, \ldots, J_V$. Each codeword $\mathbf{v}(w', j_V)$ has i.i.d. components drawn according to $P_V$.

4) For each codeword $\mathbf{v}(w', j_V)$, generate a collection of $J_U M$ codewords $\{\mathbf{u}(w', j_V, w, j_U)\}$ indexed by $w = 1, \ldots, M$,
$j_U = 1, \ldots, J_U$. Each codeword $\mathbf{u}(w', j_V, w, j_U)$ has i.i.d. components drawn according to $P_{U|V}$.

5) For each codeword $\mathbf{v}(w', j_V)$, for each codeword $\mathbf{u}(w', j_V, w, j_U)$, generate a collection of $J_R M_R$ codewords
$\{\mathbf{u}_R(w', j_V, w, j_U, k, j_R)\}$ indexed by $k = 1, \ldots, M_R$, $j_R = 1, \ldots, J_R$. Each codeword $\mathbf{u}_R(w', j_V, w, j_U, k, j_R)$ has
i.i.d. components drawn according to $P_{U_R|V,U}$.

6) For each codeword $\mathbf{v}(w', j_V)$, for each codeword $\mathbf{u}(w', j_V, w, j_U)$, generate a collection of $J_D M_D$ codewords
$\{\mathbf{u}_D(w', j_V, w, j_U, l, j_D)\}$ indexed by $l = 1, \ldots, M_D$, $j_D = 1, \ldots, J_D$. Each codeword $\mathbf{u}_D(w', j_V, w, j_U, l, j_D)$ has
i.i.d. components drawn according to $P_{U_D|V,U}$.

7) (Binning à la Marton [36], [37]): For $\iota_R \in [1, 2^{n\hat{R}_R}]$, define the cells

$$\mathcal{B}_{\iota_R} = \big[(\iota_R - 1)\,2^{n[R_R - \hat{R}_R - \epsilon]} + 1,\;\; \iota_R\, 2^{n[R_R - \hat{R}_R - \epsilon]}\big].$$

Similarly, for $\iota_D \in [1, 2^{n\hat{R}_D}]$, define the cells

$$\mathcal{C}_{\iota_D} = \big[(\iota_D - 1)\,2^{n[R_D - \hat{R}_D - \epsilon]} + 1,\;\; \iota_D\, 2^{n[R_D - \hat{R}_D - \epsilon]}\big],$$

where, without loss of generality, $2^{n[R_R - \hat{R}_R - \epsilon]}$ and $2^{n[R_D - \hat{R}_D - \epsilon]}$ are considered to be integer valued.
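The cells in step 7 simply partition a range of codeword indices into consecutive intervals of equal size, so that a description index selects a cell and a decoded codeword index identifies the cell that contains it. The following minimal sketch illustrates this index arithmetic; the function names and the cell size used in the example are illustrative assumptions.

```python
def cell_bounds(label, cell_size):
    """Return (first, last), the inclusive 1-based index range of the cell with 1-based label."""
    if label < 1 or cell_size < 1:
        raise ValueError("label and cell_size must be positive integers")
    return (label - 1) * cell_size + 1, label * cell_size

def cell_of(index, cell_size):
    """Return the 1-based label of the cell that contains the 1-based codeword index."""
    if index < 1 or cell_size < 1:
        raise ValueError("index and cell_size must be positive integers")
    return (index - 1) // cell_size + 1

# Assumed toy example: cells of 4 codeword indices each.
first, last = cell_bounds(3, 4)
recovered = [cell_of(k, 4) for k in range(first, last + 1)]
```

In the scheme, the encoder searches inside the cell selected by the description index, and the decoder recovers that index by applying the inverse map to the codeword index it finds.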



Encoding: The encoders at the source and the relay encode messages using codebook 1 for blocks with odd 
indices, and codebook 2 for blocks with even indices. This is done because some of the decoding steps are performed 
jointly over two adjacent blocks, and so having independent codebooks makes the error events corresponding to 
these blocks independent and their probabilities easier to evaluate. 

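We pick up the story in block $i$. Let $w_i$ be the new message to be sent from the source node at the beginning of
block $i$, and $w_{i-1}$ the message sent in the previous block $i-1$. The encoding at the beginning of block $i$ is as follows.
The source finds, if possible, a pair $(\iota_{Ri}, \iota_{Di}) \in [1, 2^{n\hat{R}_R}] \times [1, 2^{n\hat{R}_D}]$ such that $(\mathbf{s}[i+2], \mathbf{s}_R[\iota_{Ri}], \mathbf{s}_D[\iota_{Di}])$ are jointly typical.
If such a pair $(\iota_{Ri}, \iota_{Di})$ does not exist, simply set $(\iota_{Ri}, \iota_{Di}) = (1, 1)$. We shall show that a successful encoding of $\mathbf{s}[i+2]$ at the
source is accomplished with high probability provided that $n$ is sufficiently large and

$$\hat{R}_R > I(S; S_R)$$

$$\hat{R}_D > I(S; S_D)$$

$$\hat{R}_R + \hat{R}_D > I(S; S_R, S_D) + I(S_R; S_D). \tag{9}$$

The source will send the triple $(w_i, \iota_{Ri}, \iota_{Di})$ over the channel. First, let us assume that the relay has decoded
correctly message $w_{i-1}$ and the indices $(\iota_{Ri-2}, \iota_{Ri-1})$, and the destination has decoded correctly message $w_{i-2}$ and the
index $\iota_{Di-2}$. We shall show that our code construction allows the relay to decode correctly message $w_i$ and the index
$\iota_{Ri}$, and the destination to decode correctly message $w_{i-1}$ and the index $\iota_{Di-1}$, at the end of block $i$ (with a probability
of error $< \epsilon$). Thus, the information state $(w_{i-1}, \iota_{Ri-1}, \iota_{Di-2})$ propagates forward and a recursive calculation of
the probability of error can be made, yielding a probability of error $< (B+1)\epsilon$.
We continue with the strategy at the beginning of block $i$.

1) The relay knows $w_{i-1}$ and $\iota_{Ri-2}$ and searches for the smallest $j_V \in \mathcal{J}_V$ such that $\mathbf{v}(w_{i-1}, j_V)$ is jointly typical with
$\mathbf{s}_R[\iota_{Ri-2}]$ (the properties of jointly typical sequences guarantee that, with probability close to one, there exists
one such $j_V$). Denote this $j_V$ by $j^\star_{V_i} = j_V(\mathbf{s}_R[\iota_{Ri-2}], w_{i-1})$. Then the relay sends a vector $\mathbf{x}_2[i]$ with i.i.d. components
given $\mathbf{v}(w_{i-1}, j^\star_{V_i})$ and $\mathbf{s}_R[\iota_{Ri-2}]$, drawn according to the marginal $P_{X_2|V,S_R}$ induced by the distribution (5). (For
$i = 1, 2$, the relay does not know an estimate of the channel state and so it sends some default codeword.)

2) The source first searches for the smallest $j_U \in \mathcal{J}_U$ such that $\mathbf{u}(w_{i-1}, j^\star_{V_i}, w_i, j_U)$ is jointly typical with the vector
$(\mathbf{s}[i], \mathbf{s}_R[\iota_{Ri-2}], \mathbf{s}_D[\iota_{Di-2}])$ given $\mathbf{v}(w_{i-1}, j^\star_{V_i})$. (Again, the properties of jointly typical sequences guarantee that
there exists one such $j_U$.) Denote this $j_U$ by $j^\star_{U_i} = j_U(\mathbf{s}[i], \mathbf{s}_R[\iota_{Ri-2}], \mathbf{s}_D[\iota_{Di-2}], w_{i-1}, w_i)$.

3) Next, the source searches for one pair of codewords in the set $\mathcal{D}$, with $k_i \in \mathcal{B}_{\iota_{Ri}}$ and $l_i \in \mathcal{C}_{\iota_{Di}}$, where

$$\mathcal{D} = \Big\{ \big(\mathbf{u}_R(w_{i-1}, j^\star_{V_i}, w_i, j^\star_{U_i}, k_i, j_{R_i}),\; \mathbf{u}_D(w_{i-1}, j^\star_{V_i}, w_i, j^\star_{U_i}, l_i, j_{D_i})\big) \;\text{s.t.:}$$
$$\big(\mathbf{u}_R(w_{i-1}, j^\star_{V_i}, w_i, j^\star_{U_i}, k_i, j_{R_i}), \mathbf{s}[i], \mathbf{s}_R[\iota_{Ri-2}], \mathbf{s}_D[\iota_{Di-2}]\big) \in T_\epsilon^n(P_{U_R,S,S_R,S_D|U,V})$$
$$\big(\mathbf{u}_D(w_{i-1}, j^\star_{V_i}, w_i, j^\star_{U_i}, l_i, j_{D_i}), \mathbf{s}[i], \mathbf{s}_R[\iota_{Ri-2}], \mathbf{s}_D[\iota_{Di-2}]\big) \in T_\epsilon^n(P_{U_D,S,S_R,S_D|U,V})$$
$$\big(\mathbf{u}_R(w_{i-1}, j^\star_{V_i}, w_i, j^\star_{U_i}, k_i, j_{R_i}),\; \mathbf{u}_D(w_{i-1}, j^\star_{V_i}, w_i, j^\star_{U_i}, l_i, j_{D_i})\big) \in T_\epsilon^n(P_{U_R,U_D|U,V,S,S_R,S_D}) \Big\}. \tag{10}$$

We shall show that, with high probability, the source will find one such pair provided that $n$ is sufficiently
large and

$$\hat{R}_R + \hat{R}_D < R_R + R_D - I(U_R; U_D|U, V, S, S_R, S_D). \tag{11}$$

Denote the found pair as $\big(\mathbf{u}_R(w_{i-1}, j^\star_{V_i}, w_i, j^\star_{U_i}, k_i, j^\star_{R_i}),\; \mathbf{u}_D(w_{i-1}, j^\star_{V_i}, w_i, j^\star_{U_i}, l_i, j^\star_{D_i})\big)$.

4) The source then sends a vector $\mathbf{x}_1[i]$ with i.i.d. components given the vectors $\mathbf{v}(w_{i-1}, j^\star_{V_i})$, $\mathbf{u}(w_{i-1}, j^\star_{V_i}, w_i, j^\star_{U_i})$,
$\mathbf{u}_R(w_{i-1}, j^\star_{V_i}, w_i, j^\star_{U_i}, k_i, j^\star_{R_i})$, $\mathbf{u}_D(w_{i-1}, j^\star_{V_i}, w_i, j^\star_{U_i}, l_i, j^\star_{D_i})$ and $(\mathbf{s}[i], \mathbf{s}_R[\iota_{Ri-2}], \mathbf{s}_D[\iota_{Di-2}])$, drawn according to the marginal
$P_{X_1|V,U,U_R,U_D,S,S_R,S_D}$ induced by the distribution (5).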
Decoding: Decoding and state reconstruction at the relay are based on classical joint typicality. Decoding and 
state reconstruction at the destination are based on joint typicality and window-decoding. The decoding and 
reconstruction procedures at the end of block i are as follows. 

1) The relay knows and (r,_2 (in fact, the relay knows also (r,_i but does not use it for decoding in this step). 
It declares that (zi),, lri) are sent if there exists a unique triple (ly,, jui, ki), ly, e [1, M], jui G }u, ki e [1, Mr], such 
that u{iVi-i,j*^,zu„]ui), viR{Wi-i, il^,Wi,]ui,h, jRi) are jointly typical with (y2[!l Sr[(r,_2]) given v(tt;,_i, 7*.), for 
some 7'r, e ]r, where 7*. = 7V(sr[(r!-2], Wz-i)- One can show that, with the choice ([8), the decoding error in this 
step is small for sufficiently large n if 

R < I{U; Y2, Sr\V) - I{U; S, Sr, Sd\V). (12) 

If l lT2t is satisfied, the estimate (r, of (r, at the relay is the index of the Bf^, containing the foimd /c„ i.e., fc, e Sf^ . 

2) The destination knows the pair {Wi-2Ji-2) and the index 7j;_j = 7V(sR[tRi-3], Wi-2) and decodes the pair 
(z(;,_i,Z,_i) based on the information received in block i — 1 and block i. It declares that (it;,-!, Z,_i) is sent if 
there is a imique triple (z&,_i, jm-i, Z,-i), z&i-i G [1,M], jm-i e fu, e [1,Md], and a imique jvi G fv, such that 
u{iu,-2, ivi-i'^'^^i-i' fui-i), Ud{io,-2, jy^_^,iVi-i, jui-i,l,-i, jDi-i) are jointly typical with {yjli - 1], Sd[(d/-3]) given 
v(ii',~2/ ivi-i) ^"'^ v(ri',_i, jvi) is jointly typical with (ysfi], sd['d!-2])- One can show that, with the choice JUl, the 
decoding error in this step is small for sufficiently large n if 

$$R < I(V, U; Y_3, \hat S_D) - I(V, U; S, \hat S_R, \hat S_D)$$

$$0 < I(V; Y_3, \hat S_D) - I(V; \hat S_R). \tag{13}$$

If il3\ is satisfied, the estimate Tdi-i of (dz-i at the destination is the index of the CfQ,_j containing the found 
//-I, i.e., //-I e ef„_j . Also, the destination obtains the correct index 7* . = 7V(sr[iri-2], wy-i)-
□

The achievable rate in Theorem 1 requires the relay to decode the message sent by the source fully, which can be a rather severe constraint. We can generalize Theorem 1 by allowing the relay to decode the message sent by the source only partially [39]. This can be done by splitting the information sent by the source into two independent parts: one part is sent through the relay and the other part is sent directly to the destination. In the following corollary, the random variables $V$, $U$, $U_R$ and $U_D$ play the same roles as in Theorem 1, and $U_1$ is a new random variable that represents the information sent directly to the destination.

Corollary 1: The capacity of the discrete memoryless state-dependent relay channel with informed source is 
lower bounded by 

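$$\max \min \big\{ I(U; Y_2 | V, \hat S_R) - I(U; S, \hat S_D | V, \hat S_R),\ I(U, V; Y_3 | \hat S_D) - I(U, V; S, \hat S_R | \hat S_D) \big\} + I(U_1; Y_3 | U, V, \hat S_D) - I(U_1; S, \hat S_R | U, V, \hat S_D) \tag{14}$$

subject to the constraints

$$I(S; \hat S_R) < I(U_R; Y_2, \hat S_R | U, V) - I(U_R; S, \hat S_R, \hat S_D | U, V) \tag{15a}$$

$$I(S; \hat S_D) < I(U_D; Y_3, \hat S_D | U_1, U, V) - I(U_D; S, \hat S_R, \hat S_D | U_1, U, V) + \big[ I(U_1, U; Y_3, \hat S_D | V) - I(U_1, U; S, \hat S_R, \hat S_D | V) \big]^- \tag{15b}$$

$$I(S; \hat S_R, \hat S_D) + I(\hat S_R; \hat S_D) < I(U_R; Y_2, \hat S_R | U, V) - I(U_R; S, \hat S_R, \hat S_D | U, V) + I(U_D; Y_3, \hat S_D | U_1, U, V) - I(U_D; S, \hat S_R, \hat S_D | U_1, U, V) + \big[ I(U_1, U; Y_3, \hat S_D | V) - I(U_1, U; S, \hat S_R, \hat S_D | V) \big]^- - I(U_R; U_D | U_1, U, V, S, \hat S_R, \hat S_D) \tag{15c}$$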

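where $[x]^- = \min(x, 0)$, and the maximization is over all joint measures on $\mathcal S \times \hat{\mathcal S}_R \times \hat{\mathcal S}_D \times \mathcal U_R \times \mathcal U_D \times \mathcal U_1 \times \mathcal U \times \mathcal V \times \mathcal X_1 \times \mathcal X_2 \times \mathcal Y_2 \times \mathcal Y_3$ of the form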

Ps,Sr,Sd,Ur,Ud,U,V,Xi,X2,Y2,Y3 

= QsPs,oSd\S^V\Sr^U\V,sA<Sd^Ui\V,U,S,Sr,Sd^UioUd\V,UMi,SSr,Sd^Xi\UrMdMXS,S^ 

and satisfying Ui <-> (V, U, S, Sr, Sd) «-> (Jk is a Markov chain and 

$$0 < I(V; Y_3, \hat S_D) - I(V; \hat S_R)$$
$$0 < I(U; Y_2 | V, \hat S_R) - I(U; S, \hat S_D | V, \hat S_R)$$
$$0 < I(U_1; Y_3 | U, V, \hat S_D) - I(U_1; S, \hat S_R | U, V, \hat S_D). \tag{17}$$

The proof of Corollary 1 follows by a straightforward extension of that of Theorem 1, and so we omit it here for brevity.

Remark 4: In the coding scheme of Corollary 1, if the source sends no descriptions of the state to the relay and destination, i.e., $\hat S_R = \hat S_D = \emptyset$, the coding scheme reduces to a generalized Gel'fand-Pinsker binning scheme at the source that is combined with partial DF. In this case, the relay sends codewords that carry part of the information message and are independent of the channel states. The following achievable rate$^1$ is obtained from Corollary 1 by setting $\hat S_R = \hat S_D = \emptyset$, $U_R = U_D = \emptyset$ and $V = X_2$ independent of $S$, as

$$R = \max \min \big\{ I(U; Y_2 | X_2) + I(U_1; Y_3 | U, X_2) - I(U, U_1; S | X_2),\ I(U, U_1, X_2; Y_3) - I(U, U_1; S | X_2) \big\} \tag{18}$$

with the maximization over joint measures of the form 

$$P_{S,U,U_1,X_1,X_2,Y_2,Y_3} = Q_S P_{X_2} P_{U|S,X_2} P_{U_1,X_1|U,S,X_2} W_{Y_2,Y_3|X_1,X_2,S} \tag{19}$$

and satisfying

$$0 < I(U; Y_2 | X_2) - I(U; S | X_2)$$

$$0 < I(U_1; Y_3 | U, X_2) - I(U_1; S | U, X_2)$$

$$0 < I(U, U_1; Y_3 | X_2) - I(U, U_1; S | X_2). \tag{20}$$

B. Lower Bound on Channel Capacity: Analog Input Description 

The following theorem provides a lower bound on the capacity of the state-dependent general discrete memo- 
ryless RC with informed source. 

Theorem 2: The capacity of the discrete memoryless state-dependent relay channel with informed source is 
lower bounded by 

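$$\max\ I(U; Y_3) - I(U; S) \tag{21}$$

subject to the constraint

$$I(X; \hat X) < I(U_R; Y_2) - I(U_R; S) - I(U_R; U | S) \tag{22}$$
where the maximization is over all joint measures on $\mathcal S \times \mathcal U \times \mathcal U_R \times \mathcal X_1 \times \mathcal X_2 \times \mathcal X \times \hat{\mathcal X} \times \mathcal Y_2 \times \mathcal Y_3$ of the form

$$P_{S,U,U_R,X_1,X_2,X,\hat X,Y_2,Y_3} = Q_S P_{U,U_R|S} P_{X_1|U,U_R,S} P_{X|U,S} P_{\hat X|X} \mathbb{1}_{X_2 = \hat X} W_{Y_2,Y_3|X_1,X_2,S}. \tag{23}$$
Proof: The proof of Theorem 2 appears in Appendix B.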

Remark 5: The rationale for the coding scheme which we use to obtain the lower bound in Theorem 2 is as follows. Had the relay known the message to be sent in each block and the state that corrupts the transmission in that block, the relay would generate its input using a collaborative Gel'fand-Pinsker scheme as in [34]. For our model, the source knows the message that the relay should optimally send in each block (if the relay gets the message correctly). It also knows the state sequence that corrupts the transmission in that block. It can then generate the appropriate relay input vector that the relay would send had the relay known the message and the state. The source can send this vector to the relay ahead of time, and if the relay can estimate it to high accuracy, then collaborative source-relay binning in the sense of [34] is readily realized for transmission from the source and relay to the destination.

$^1$We note that the achievable rate (18) is slightly larger than that of [13, Theorem 1].

More precisely, let us consider transmission in two adjacent blocks $i$ and $i+1$. In the beginning of block $i$, the source sends the information $w_i$ of the current block and, in addition, describes to the relay the input $\mathbf{x}[i+1]$ that the relay should send in the next block $i+1$ had the relay known the message and the state $\mathbf{s}[i+1]$. Let $\hat{\mathbf{x}}[m_i]$ be a description of $\mathbf{x}[i+1]$. The information $w_i$ of block $i$ and the index $m_i$ which the source sends in block $i$ are precoded using binning against the state that controls transmission in the current block. The vector $\mathbf{x}[i+1]$, however, is the input that the relay would send in the next block $i+1$ had the relay known the state $\mathbf{s}[i+1]$, and so is generated at the source using binning against the state $\mathbf{s}[i+1]$. The vector $\mathbf{x}[i+1]$, and its description which is sent to the relay during block $i$, are intended to combine coherently with the source transmission in block $i+1$.

Remark 6: In the scheme we described briefly in Remark 5, the relay needs only estimate the code vector $\mathbf{x}[i+1]$ sent by the source in block $i$, and transmit the obtained estimate in the next block $i+1$. In particular, the relay does not need to know the information message that the estimated vector actually carries, let alone the state sequence $\mathbf{s}[i+1]$ that controls the channel in block $i+1$. Thus, from a practical viewpoint, this may be particularly convenient for communication with an oblivious relay. Transmission from the source terminal to the relay terminal can be regarded as that of an analog source which, in block $i$, produces a sequence $\mathbf{x}[i+1]$. This source has to be transmitted by the source terminal over a state-dependent channel and reconstructed at the relay terminal. The reconstruction error at the relay terminal influences the rate at which information can be decoded reliably at the destination by acting as an additional noise term.

Remark 7: A block Markov encoding is used to establish Theorem 2. In block $i$, the source transmits the message $w_i$ and the index $m_i$ of a description $\hat{\mathbf{x}}[m_i]$ of the input $\mathbf{x}[i+1]$. In Theorem 2, the auxiliary random vector $U^n$ represents the Gel'fand-Pinsker vector associated with the information message and is binned against the state $S^n$; and the auxiliary random vector $U_R^n$ represents the Gel'fand-Pinsker vector associated with the description information and is binned against $(U^n, S^n)$.

C. Upper Bounds on Channel Capacity 

As we mentioned earlier, the relay does not know the states of the channel directly in our model, but it can potentially get some information about $S^n$ from the past received sequence from the informed source. More precisely, the input $X_{2,i}$ of the relay at time $i$ depends on the channel states through $Y_2^{i-1} = (Y_{2,1}, \ldots, Y_{2,i-1})$, which in turn depends on these states through $S^{i-1}$ and the past source inputs $X_{1,j}(W, S^n)$, $j = 1, \ldots, i-1$. Further, because the source knows the states non-causally, this dependence may even be non-causal. This aspect makes it hard to establish non-trivial upper bounds on the capacity, i.e., bounds that are strictly better than the cut-set upper bound (24).

The following theorem provides an upper bound on the capacity of the state-dependent general discrete memoryless RC with informed source.

Theorem 3: The capacity of the discrete memoryless state-dependent relay channel with informed source is upper-bounded by

$$\max \min \big\{ I(V; Y_2, Y_3 | U, X_2) - I(V; S | U, X_2),\ I(V; Y_3) - I(V; S) \big\} \tag{25}$$
where the maximization is over measures of the form

$$P_{S,U,V,X_1,X_2,Y_2,Y_3} = Q_S P_{U|S} P_{X_2|U,S} P_{V,X_1|U,S} W_{Y_2,Y_3|X_1,X_2,S} \tag{26}$$
and $U \in \mathcal U$, $V \in \mathcal V$ are auxiliary random variables with

$$|\mathcal U| \le |\mathcal S| |\mathcal X_1| |\mathcal X_2| \tag{27a}$$
$$|\mathcal V| \le \big( |\mathcal S| |\mathcal X_1| |\mathcal X_2| \big)^2, \tag{27b}$$

respectively.

Proof: The proof of Theorem 3 appears in Appendix C.

Note that the relay input $X_2$ depends on the state $S$ in the measure (26), and this reflects our discussion above. Also, one can specialize the upper bound (25) to the special model of Figure 2 using the channel structure and obtain an upper bound on the capacity of this model. Instead, we establish a better upper bound by exploiting more fully the fact that the input component $X_{1R}^n$ that is heard at the relay does not know the state $S^n$ at all in this model, and that the relay output $Y_2^{i-1}$ is a function of only the strictly causal part $S^{i-1}$ of the state in this case. The result is stated in the following theorem.

Theorem 4: The capacity of the discrete memoryless state-dependent relay model of Figure 2 is upper-bounded by

$$\max \min \big\{ I(X_{1R}; Y_2 | X_2, S),\ I(X_2; Y_3) \big\} + I(X_{1D}; Y_3 | X_2, S) \tag{28}$$
where the maximization is over all joint measures of the form

$$P_{S,X_{1R},X_{1D},X_2,Y_2,Y_3} = Q_S P_{X_2} P_{X_{1R}|X_2} P_{X_{1D}|X_2,S} W_{Y_2|X_{1R},S} W_{Y_3|X_{1D},X_2,S}. \tag{29}$$

Proof: The proof of Theorem 4 appears in Appendix D.

Remark 8: We note that although the output $Y_2^{i-1}$ at the relay at time $i$ in the special case model of Figure 2 can convey information only about the strictly causal part $S^{i-1}$ of the state, upper bounding the channel capacity is not trivial even in this case. For a related, somewhat simpler model, a two-user multiaccess channel with common and one individual messages, we have shown recently in [31], [32] that strictly causal knowledge of the state at the encoder that sends only the common message can increase the transmission rate of the other encoder in general, even if the latter knows the states non-causally; however, the capacity region is still to be characterized in general. For the special case model in Figure 2, it is not yet clear how the relay could optimally exploit the information about the state $S^{i-1}$ that is contained in $Y_2^{i-1}$. The second term of the minimization in (28) upper-bounds the information that the source and the relay can send to the destination by

$$I(X_2; Y_3) + I(X_{1D}; Y_3 | X_2, S) = I(X_{1D}, X_2; Y_3 | S) - I(X_2; S | Y_3), \tag{30}$$
which is strictly better than the corresponding term in the cut-set upper bound (24).

IV. The Gaussian RC with Informed Source 

A. System Model 

In this section, we consider a full-duplex state-dependent RC with informed source in which the channel state and the noise are additive and Gaussian. In this model, the channel state can model an additive Gaussian interference which is assumed to be known (non-causally) to only the source. The channel outputs $Y_{2,i}$ and $Y_{3,i}$ at time instant $i$ for the relay and the destination, respectively, are related to the channel input $X_{1,i}$ from the source and $X_{2,i}$ from the relay, and the channel state $S_i$, by

$$Y_{2,i} = X_{1,i} + S_i + Z_{2,i} \tag{31a}$$
$$Y_{3,i} = X_{1,i} + X_{2,i} + S_i + Z_{3,i}. \tag{31b}$$

The channel state $S_i$ is a zero-mean Gaussian random variable with variance $Q$, and only the source knows the state sequence $S^n$ (non-causally). The noises $Z_{2,i}$ and $Z_{3,i}$ are zero-mean Gaussian random variables with variances $N_2$ and $N_3$, respectively; they are mutually independent and independent from the state sequence $S^n$ and the channel inputs $(X_1^n, X_2^n)$.

We shall also consider an important special case of the general Gaussian model (31) for which our bounds will be tighter. In this special case, the source input is $X_{1,i} = (X_{1R,i}, X_{1D,i})$ with $X_{1R,i}$ independent of the channel state $S^n$, and the channel outputs $Y_{2,i}$ and $Y_{3,i}$ at time instant $i$ for the relay and the destination, respectively, are related to the channel inputs from the source and relay and the channel state $S_i$ by

$$Y_{2,i} = X_{1R,i} + S_i + Z_{2,i} \tag{32a}$$
$$Y_{3,i} = X_{1D,i} + X_{2,i} + S_i + Z_{3,i}. \tag{32b}$$



For the general model (31), we consider the following individual power constraints on the average transmitted power at the source and the relay,

$$\sum_{i=1}^{n} X_{1,i}^2 \le n P_1, \qquad \sum_{i=1}^{n} X_{2,i}^2 \le n P_2. \tag{33}$$

For the special case model (32), we consider separate power constraints on the average transmitted power at the encoder components,

$$\sum_{i=1}^{n} X_{1R,i}^2 \le n P_{1R}, \qquad \sum_{i=1}^{n} X_{1D,i}^2 \le n P_{1D}, \qquad \sum_{i=1}^{n} X_{2,i}^2 \le n P_2. \tag{34}$$

The definition of a code for the Gaussian model is the same as that given in the discrete case, with the additional constraint that the channel inputs should satisfy the appropriate power constraint, (33) or (34).
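As a rough illustration of the general Gaussian model, the following minimal sketch draws one block of channel outputs according to the relations $Y_2 = X_1 + S + Z_2$ and $Y_3 = X_1 + X_2 + S + Z_3$. The function names and the placeholder inputs are assumptions made for illustration only: the inputs here are plain Gaussian vectors rescaled to meet the power constraints with equality, not the coded, state-dependent inputs used in the paper.

```python
import math
import random


def scale_to_power(x, power):
    """Rescale x so that its total energy equals len(x) * power."""
    energy = sum(v * v for v in x)
    gain = math.sqrt(len(x) * power / energy)
    return [gain * v for v in x]


def simulate_block(n, P1, P2, Q, N2, N3, seed=0):
    """Draw one length-n block of the general Gaussian relay model.

    Hypothetical helper: x1 and x2 are placeholder inputs (independent of
    the state), used only to exercise the channel equations.
    """
    rng = random.Random(seed)
    s = [rng.gauss(0.0, math.sqrt(Q)) for _ in range(n)]       # channel state
    x1 = scale_to_power([rng.gauss(0.0, 1.0) for _ in range(n)], P1)
    x2 = scale_to_power([rng.gauss(0.0, 1.0) for _ in range(n)], P2)
    # Relay output: source input + state + noise of variance N2.
    y2 = [x1[i] + s[i] + rng.gauss(0.0, math.sqrt(N2)) for i in range(n)]
    # Destination output: both inputs + state + noise of variance N3.
    y3 = [x1[i] + x2[i] + s[i] + rng.gauss(0.0, math.sqrt(N3)) for i in range(n)]
    return s, x1, x2, y2, y3
```

The rescaling step is a design shortcut: it enforces the per-block energy constraint deterministically instead of relying on the law of large numbers.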






B. Lower Bounds on Channel Capacity 

The following theorem provides a lower bound on the capacity of the state-dependent general Gaussian RC 
with informed source. 

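Theorem 5: The capacity of the state-dependent Gaussian RC with informed source is lower-bounded by

$$\max_{\gamma}\ \frac{1}{2} \log \left( 1 + \frac{\big( \sqrt{(1-\gamma) P_1} + \sqrt{P_2 - D} \big)^2}{N_3 + D + \gamma P_1} \right) \tag{35}$$

where

$$D := P_2 \frac{N_2}{N_2 + \gamma P_1} \tag{36}$$

and the maximization is over $\gamma \in [0, 1]$.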



Remark 9: It is insightful to observe that the rate in Theorem 5 does not depend on the strength of the state $S$. This makes the coding scheme appealing, particularly for the case of arbitrarily strong interference, in which classical coding schemes suffer greatly from the strong interference unknown at the relay.
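To make the dependence on the power split concrete, here is a minimal numerical sketch of the bound of Theorem 5. It assumes the bound has the form $\frac{1}{2}\log\big(1 + (\sqrt{(1-\gamma)P_1} + \sqrt{P_2 - D})^2 / (N_3 + D + \gamma P_1)\big)$ with $D = P_2 N_2 / (N_2 + \gamma P_1)$, which is the form suggested by the scale factors in (39) and by the relation (52). The function names, the grid search over $\gamma$, and the use of base-2 logarithms are illustrative assumptions.

```python
import math


def distortion(gamma, P1, P2, N2):
    """Distortion at which the relay can recover the described input."""
    return P2 * N2 / (N2 + gamma * P1)


def theorem5_rate(gamma, P1, P2, N2, N3):
    """Rate (bits per channel use) for a fixed power split gamma."""
    D = distortion(gamma, P1, P2, N2)
    # max(0, .) guards against tiny negative values from rounding at gamma = 0.
    coherent = (math.sqrt((1.0 - gamma) * P1) + math.sqrt(max(0.0, P2 - D))) ** 2
    return 0.5 * math.log2(1.0 + coherent / (N3 + D + gamma * P1))


def theorem5_bound(P1, P2, N2, N3, steps=1000):
    """Grid search over gamma in [0, 1]; returns (best rate, best gamma)."""
    return max(
        (theorem5_rate(k / steps, P1, P2, N2, N3), k / steps)
        for k in range(steps + 1)
    )
```

Note that the state variance $Q$ does not appear among the arguments, in line with Remark 9.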

Outline of Proof of Theorem 5: The result in Theorem 2 for the DM case can be extended to memoryless channels with discrete time and continuous alphabets using standard techniques [40, Chapter 7]. The proof of Theorem 5 follows through evaluation of the lower bound of Theorem 2 using the following jointly Gaussian input distribution. For $0 \le \gamma \le 1$, we let $X \sim \mathcal N(0, P_2)$ and $X_{1R} \sim \mathcal N(0, \gamma P_1)$, with $X$ jointly Gaussian with $S$ with $\mathbb E[XS] = 0$; and $X_{1R}$ jointly Gaussian with $(S, X)$, with $\mathbb E[X_{1R} S] = \mathbb E[X_{1R} X] = 0$. Also, for $0 \le D \le P_2$ given, we consider the test channel $\hat X = aX + \tilde X$, where $a := 1 - D/P_2$ and $\tilde X$ is a Gaussian random variable with zero mean and variance $\tilde P_2 = D(1 - D/P_2)$, independent from $X$ and $S$. Using this test channel, we calculate $\mathbb E[(X - \hat X)^2] = D$ and $\mathbb E[\hat X^2] = P_2 - D$.

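We use the following choices of the auxiliary random variables in Theorem 2:

$$U = \left( \sqrt{\frac{(1-\gamma) P_1}{P_2}} + \sqrt{\frac{P_2 - D}{P_2}} \right) X + \alpha S \tag{37}$$

$$U_R = X_{1R} + \alpha_R \left( S + \sqrt{\frac{(1-\gamma) P_1}{P_2}}\, X \right), \tag{38}$$

where

$$\alpha = \frac{\big( \sqrt{(1-\gamma) P_1} + \sqrt{P_2 - D} \big)^2}{\big( \sqrt{(1-\gamma) P_1} + \sqrt{P_2 - D} \big)^2 + (N_3 + D + \gamma P_1)} \quad \text{and} \quad \alpha_R = \frac{\gamma P_1}{\gamma P_1 + N_2}. \tag{39}$$

Through straightforward algebra, which we omit here for brevity, it can be shown that the evaluation of the lower bound of Theorem 2 using the above choice gives the lower bound in Theorem 5.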

Alternative Proof of Theorem 5: The encoding and transmission scheme is as follows. For $0 \le \gamma \le 1$, let $X \sim \mathcal N(0, P_2)$ and $X_{1R} \sim \mathcal N(0, \gamma P_1)$, with $X$ jointly Gaussian with $S$ with $\mathbb E[XS] = 0$; and $X_{1R}$ jointly Gaussian with $(S, X)$, with $\mathbb E[X_{1R} S] = \mathbb E[X_{1R} X] = 0$. Also, let $0 \le D \le P_2$ be given, and consider the test channel $\hat X = aX + \tilde X$, where $a := 1 - D/P_2$ and $\tilde X$ is a Gaussian random variable with zero mean and variance $\tilde P_2 = D(1 - D/P_2)$, independent from $X$ and $S$. Using this test channel, we calculate $\mathbb E[(X - \hat X)^2] = D$ and $\mathbb E[\hat X^2] = P_2 - D$.

We use the two random variables $U$ and $U_R$ given by (43) to generate the auxiliary codewords $U_i$ and $U_{R,i}$, which we will use in the sequel.

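As in the discrete case, a block Markov encoding is used. For each block $i$, let $\mathbf{x}[w_i]$ be a Gaussian signal which carries message $w_i \in [1, 2^{nR}]$ and is obtained via a DPC considering $\mathbf{s}[i]$ as noncausal channel state information, as

$$\left( \sqrt{\frac{(1-\gamma) P_1}{P_2}} + \sqrt{\frac{P_2 - D}{P_2}} \right) \mathbf{x}[w_i] = \mathbf{u}[i] - \alpha\, \mathbf{s}[i], \tag{40}$$

where the components of $\mathbf{u}[i]$ are generated i.i.d. using the auxiliary random variable $U$.

For every block $i$, the source quantizes $\mathbf{x}[w_i]$ into $\hat{\mathbf{x}}[m_i]$, where $m_i \in [1, 2^{n\hat R}]$. Using the above test channel, the source can encode $\mathbf{x}[w_i]$ successfully at the quantization rate

$$\hat R = I(X; \hat X) = \frac{1}{2} \log \left( \frac{P_2}{D} \right). \tag{41}$$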

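Let $m_i$ be the index associated with $\mathbf{x}[w_{i+1}]$. In the beginning of block $i$, the source sends a superposition of two Gaussian vectors,

$$\mathbf{x}_1[i] = \mathbf{x}_{1R}[m_i] + \sqrt{\frac{(1-\gamma) P_1}{P_2}}\, \mathbf{x}[w_i]. \tag{42}$$

In equation (42), the signal $\mathbf{x}_{1R}[m_i]$ carries message $m_i$ and is obtained via a DPC considering $(\mathbf{s}[i], \mathbf{x}[w_i])$ as noncausal channel state information, as

$$\mathbf{x}_{1R}[m_i] = \mathbf{u}_R[i] - \alpha_R \left( \mathbf{s}[i] + \sqrt{\frac{(1-\gamma) P_1}{P_2}}\, \mathbf{x}[w_i] \right), \tag{43}$$
where the components of $\mathbf{u}_R[i]$ are generated i.i.d. using the auxiliary random variable $U_R$.

In the beginning of block $i$, the relay has decoded message $m_{i-1}$ correctly (this will be justified below) and sends

$$\mathbf{x}_2[i] = \sqrt{\frac{P_2}{P_2 - D}}\, \hat{\mathbf{x}}[m_{i-1}]. \tag{44}$$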

For the decoding arguments at the relay and the destination, we give simple arguments based on intuition (the rigorous decoding uses joint typicality). Also, since all the random variables are i.i.d., we sometimes omit the time index. The relay decodes the index $m_i$ from the received $\mathbf{y}_2[i]$ at the end of block $i$. Since the signal $\mathbf{x}_{1R}[m_i]$ is precoded at the source against the interference caused by the information message $w_i$, decoding at the relay can be done reliably as long as $n$ is large and

$$\hat R \le \frac{1}{2} \log \left( 1 + \frac{\gamma P_1}{N_2} \right). \tag{45}$$

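The destination decodes message $w_i$ from the received $\mathbf{y}_3[i]$ at the end of block $i$, considering the signal $\mathbf{x}_{1R}[m_i]$ as unknown noise, with

$$\mathbf{y}_3[i] = \mathbf{x}_1[i] + \mathbf{x}_2[i] + \mathbf{s}[i] + \mathbf{z}_3[i] = \left( \sqrt{\frac{(1-\gamma) P_1}{P_2}}\, \mathbf{x}[w_i] + \sqrt{\frac{P_2}{P_2 - D}}\, \hat{\mathbf{x}}[m_{i-1}] \right) + \mathbf{s}[i] + \big( \mathbf{z}_3[i] + \mathbf{x}_{1R}[m_i] \big). \tag{46}$$
Let now $\mathbf{x}'[i]$ be the optimal linear estimator of $\left( \sqrt{\frac{(1-\gamma) P_1}{P_2}}\, \mathbf{x}[w_i] + \sqrt{\frac{P_2}{P_2 - D}}\, \hat{\mathbf{x}}[m_{i-1}] \right)$ given $\mathbf{x}[w_i]$ under the minimum mean square error criterion, and $\mathbf{e}_x[i]$ the resulting estimation error. The estimator $\mathbf{x}'[i]$ and the estimation error $\mathbf{e}_x[i]$ are given by

$$\mathbf{x}'[i] = \left( \sqrt{\frac{(1-\gamma) P_1}{P_2}} + \sqrt{\frac{P_2 - D}{P_2}} \right) \mathbf{x}[w_i] \tag{47}$$

$$\mathbf{e}_x[i] = \sqrt{\frac{P_2}{P_2 - D}}\, \hat{\mathbf{x}}[m_{i-1}] - \sqrt{\frac{P_2 - D}{P_2}}\, \mathbf{x}[w_i]. \tag{48}$$

We can alternatively write the output $\mathbf{y}_3[i]$ in (46) as

$$\mathbf{y}_3[i] = \left( \sqrt{\frac{(1-\gamma) P_1}{P_2}} + \sqrt{\frac{P_2 - D}{P_2}} \right) \mathbf{x}[w_i] + \mathbf{s}[i] + \big( \mathbf{z}_3[i] + \mathbf{e}_x[i] + \mathbf{x}_{1R}[m_i] \big), \tag{49}$$

where $\mathbf{e}_x[i]$ is Gaussian with variance $D$ and is independent of $\mathbf{x}[w_i]$ and $\mathbf{s}[i]$.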

Now, considering the equivalent form (49) of the output $\mathbf{y}_3[i]$, it is easy to see that the destination can decode message $w_i$ correctly at the end of block $i$ as long as $n$ is large and

$$R < I(U; Y_3) - I(U; S) = \frac{1}{2} \log \left( 1 + \frac{\big( \sqrt{(1-\gamma) P_1} + \sqrt{P_2 - D} \big)^2}{N_3 + D + \gamma P_1} \right). \tag{51}$$
Furthermore, combining (41) and (45) we get

$$D \ge P_2 \frac{N_2}{N_2 + \gamma P_1}. \tag{52}$$

Finally, observing that the RHS of (51) decreases with $D$, we obtain (35) by taking the equality in (52) and maximizing the RHS of (51) over $\gamma \in [0, 1]$. This completes the proof.
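The last step of the argument, matching the quantization rate to the rate that the source-to-relay link supports, can be sketched numerically. The code below assumes the quantization rate $\frac{1}{2}\log(P_2/D)$ and the relay decoding constraint $\frac{1}{2}\log(1 + \gamma P_1/N_2)$, and also checks that the estimation error at the destination has variance $D$ when the relay scales its estimate by $\sqrt{P_2/(P_2 - D)}$. All function names are hypothetical, and logarithms are taken in base 2.

```python
import math


def quantization_rate(P2, D):
    """Rate needed to describe X (variance P2) at distortion D."""
    return 0.5 * math.log2(P2 / D)


def relay_link_rate(gamma, P1, N2):
    """Rate supported by the dirty-paper-coded link to the relay."""
    return 0.5 * math.log2(1.0 + gamma * P1 / N2)


def smallest_distortion(gamma, P1, P2, N2):
    """Smallest D for which the description fits the relay link."""
    return P2 * N2 / (N2 + gamma * P1)


def estimation_error_variance(P2, D):
    """Variance of sqrt(P2/(P2-D)) * X_hat - sqrt((P2-D)/P2) * X, for D < P2."""
    c_hat = math.sqrt(P2 / (P2 - D))   # gain applied to X_hat at the relay
    c_x = math.sqrt((P2 - D) / P2)     # gain of the linear estimate on X
    var_hat = P2 - D                   # E[X_hat^2] under the test channel
    cross = P2 - D                     # E[X * X_hat] = a * P2
    return c_hat ** 2 * var_hat + c_x ** 2 * P2 - 2.0 * c_hat * c_x * cross
```

With $\gamma = 0.3$, $P_1 = P_2 = 10$ and $N_2 = 2$, the smallest admissible distortion is $D = 4$, at which both rates coincide.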



We now turn to establishing a lower bound on the capacity of the state-dependent Gaussian RC using the idea of state transmission. In this section, the source describes the channel state to only the relay. The relay guesses the information message and the transmitted state description and then transmits the message cooperatively with the source using binning against the state estimate, in a manner similar to that we described for the coding scheme for Theorem 1.

For convenience, we define the following quantities $Q_S(\cdot)$ and $R(\cdot)$, which we will use throughout the remaining sections.

Definition 1: Let

$$Q_S(t, Q, D) := (1 - t)^2 Q - t(t - 2) D$$
for non-negative $t$, $D$, $P$, $Q$, $N$, and $\alpha \in \mathbb R$.

The following theorem provides a lower bound on the capacity of the state-dependent general Gaussian RC 
with informed source. 

Theorem 6: The capacity of the state-dependent Gaussian RC with informed source is lower-bounded by 
= max min {^(a, (1 - p^^ - pl)ePy., S,^Q,N2 + 0Pi. + Pu), 

I 9 9-9- \ 1 / (Pl2 V^flr + VPI)^ M 

K[a, (1 - p?, - pL).p., i^Q, N3 . .P. . P.) . - log (1 . ^3,,.^,;,v,(,_^.^_;.)gp^,.,pj } 



where 



D = Q (54) 



Q = Qs(«2,Q,D), 5 = l + pi,J^ (55) 



(P12V5^+VPI)' 

"2 = = — (56) 

(P12 + VPI)2 + (1 - P?2 - P?s)0Pl'- + (^^3 + f^D + 0Pi, + Pid) 

and the maximization is over P\r > 0, Pu > such that < Pir -H Pid < Pi, G [0,1], pi2 G [0,1] and pis G 
[-1,0] such that < p^^ p^^ < 1 and a G R such that R{{1 - - p^ jePj,, ^^q^^^ + Qp^^ + p^^) > and 

R{(1 - p^^ - p2,)0Pl,., eQ,N3 + ePir + Pu) + l/21og(l + Pid/(N3 + 0Pi,)) > 0. 

Proof: A formal proof of Theorem 6 appears in Appendix E.

An outline of the proof of Theorem 6 is as follows. The result in Theorem 1 for the DM case can be extended to memoryless channels with discrete time and continuous alphabets using standard techniques [40, Chapter 7]. For the state-dependent Gaussian relay channel (31), we evaluate the corresponding achievable rate with the following choice of input distribution. We choose $\hat S_D = \emptyset$, $U_D = \emptyset$. Furthermore, we consider the test channel $\hat S_R = aS + \tilde S_R$, where $a := 1 - D/Q$ and $\tilde S_R$ is a Gaussian random variable with zero mean and variance $\tilde\sigma^2 = D(1 - D/Q)$, independent from $S$. The random variable $X_2$ is Gaussian with zero mean and variance $P_2$, independent of $S$ and of $\hat S_R$. The random variable
Xi is composed of three parts, Xi = Xsr + X^r + Xwd, where Xsr is Gaussian with zero mean and variance 0Pir, 
for some G [0, 1], is independent of S, Sr, X2; and Xwr = pis ^BKJQS + pn ^jOPirlPi^i + X^j^, where X^^^ is 
Gaussian with zero mean and variance (1 — p^^OPir, for some pi2 G [0, 1] and pis G [—1, 0] and is independent of 
$X_{sr}$, $X_2$ and $(S, \hat S_R)$; and $X_{wd}$ is Gaussian with zero mean and variance $P_{1D}$, chosen independently from all the other variables. The auxiliary random variables are chosen as



V = (pi2 y ^ + l)X2 + a2(pi., ^ ^ + i)Sr (57a) 

U = X'„^ + aUS - a2SR) (57b) 
Ui = XwD + p ^ f " „p ^(1 - a){S - a2SR) (57c) 



with 



ai = j== — ; (58a) 

(Pl2 + VPI)2 + (1 - - pl)dPir + (N3 + £2D + 0Pi, + Pu) 



Through straightforward algebra, which is omitted for brevity, it can be shown that the evaluation of this rate with the aforementioned input distribution gives (53).
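One way to read the quantity $Q_S(t, Q, D) = (1-t)^2 Q - t(t-2)D$ of Definition 1 is as the power of the residual state $S - t\hat S_R$ under the test channel $\hat S_R = aS + \tilde S_R$ described in the outline above. This interpretation is an observation added here, not a statement from the proof; the sketch below checks it by direct second-moment arithmetic, with hypothetical function names.

```python
def q_s(t, Q, D):
    """Q_S(t, Q, D) as given in Definition 1."""
    return (1.0 - t) ** 2 * Q - t * (t - 2.0) * D


def residual_state_power(t, Q, D):
    """E[(S - t * S_hat)^2] for S_hat = a*S + S_tilde.

    S has variance Q, a = 1 - D/Q, and S_tilde is independent of S
    with variance D * (1 - D/Q).
    """
    a = 1.0 - D / Q
    var_tilde = D * (1.0 - D / Q)
    # S - t*S_hat = (1 - t*a) * S - t * S_tilde, with independent terms.
    return (1.0 - t * a) ** 2 * Q + t ** 2 * var_tilde
```

Under this reading, $t = 0$ leaves the full state power $Q$, while $t = 1$ leaves only the description error $D$.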

Remark 10: The parameter $\alpha$ in Theorem 6 stands for DPC's scale factor in precoding the information message against the interference on its way to the relay and to the destination. Because the model (31) has the links to the relay and to the destination corrupted by noise terms with distinct variances, one cannot remove the effect of the interference on the two links simultaneously via one single DPC as in [9]. This explains why the parameter $\alpha$ is left to be optimized over in (53). However, in the spirit of [9], one can improve the rate of Theorem 6 by time sharing coding schemes that are similar to the one we employed for Theorem 6 but with different inflation parameters tailored respectively for the link to the relay and the link to the destination, as in [13].



C. Upper Bounds on Channel Capacity 

Similar to the general DM model, in the general Gaussian model (31) the relay does not know the states of the channel directly but can potentially get information about $S^n$ from the observed output sequence $Y_2^{i-1}$. Also, $Y_2^{i-1}$ may even contain information about future values of the state, and this makes establishing upper bounds on the capacity that are strictly better than the cut-set upper bound

$$\max \min \big\{ I(X_1; Y_2, Y_3 | S, X_2),\ I(X_1, X_2; Y_3 | S) \big\} \tag{59}$$

more difficult.

Note that both $X_1$ and $X_2$ know the state $S$ in (59). For the special case Gaussian model (32), we establish an upper bound that is strictly better than (59) by accounting for the fact that the source input component $X_{1R,i}$ at time $i$ does not know the state $S^n$ at all and that the relay output $Y_2^{i-1}$ is a function of only the strictly causal part of the state in this case. The following theorem states the corresponding result.

Theorem 7: The capacity of the state-dependent Gaussian relay model (32) is upper-bounded by

$$\max \min \left\{ \frac{1}{2} \log \left( 1 + \frac{P_{1R}}{N_2} \right) + \frac{1}{2} \log \left( 1 + \frac{P_{1D} (1 - \rho_{12}^2 - \rho_{1s}^2)}{N_3} \right),\ \frac{1}{2} \log \left( 1 + \frac{\big( \sqrt{P_2} + \rho_{12} \sqrt{P_{1D}} \big)^2}{P_{1D} (1 - \rho_{12}^2 - \rho_{1s}^2) + \big( \sqrt{Q} + \rho_{1s} \sqrt{P_{1D}} \big)^2 + N_3} \right) + \frac{1}{2} \log \left( 1 + \frac{P_{1D} (1 - \rho_{12}^2 - \rho_{1s}^2)}{N_3} \right) \right\} \tag{60}$$

where the maximization is over parameters $\rho_{12} \in [0, 1]$, $\rho_{1s} \in [-1, 0]$ such that

$$\rho_{12}^2 + \rho_{1s}^2 \le 1. \tag{61}$$

Proof: The proof of Theorem 7 appears in Appendix F.
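The upper bound of Theorem 7 can be evaluated by a simple grid search over the two correlation parameters. The sketch below assumes the following reading of (60): the first term of the minimization is $\frac{1}{2}\log(1 + P_{1R}/N_2)$ plus the direct-link term $\frac{1}{2}\log(1 + P_{1D}\zeta/N_3)$ with $\zeta = 1 - \rho_{12}^2 - \rho_{1s}^2$, and the second term replaces the relay-link logarithm by $\frac{1}{2}\log\big(1 + (\sqrt{P_2} + \rho_{12}\sqrt{P_{1D}})^2 / (P_{1D}\zeta + (\sqrt{Q} + \rho_{1s}\sqrt{P_{1D}})^2 + N_3)\big)$. The function names, grid resolution, and base-2 logarithms are illustrative assumptions.

```python
import math


def theorem7_terms(rho12, rho1s, P1R, P1D, P2, Q, N2, N3):
    """Return the two terms of the minimization for fixed correlations."""
    zeta = max(0.0, 1.0 - rho12 ** 2 - rho1s ** 2)   # guard against rounding
    direct = 0.5 * math.log2(1.0 + P1D * zeta / N3)
    relay_cut = 0.5 * math.log2(1.0 + P1R / N2) + direct
    num = (math.sqrt(P2) + rho12 * math.sqrt(P1D)) ** 2
    den = P1D * zeta + (math.sqrt(Q) + rho1s * math.sqrt(P1D)) ** 2 + N3
    mac_cut = 0.5 * math.log2(1.0 + num / den) + direct
    return relay_cut, mac_cut


def theorem7_bound(P1R, P1D, P2, Q, N2, N3, steps=50):
    """Grid search over rho12 in [0, 1] and rho1s in [-1, 0]."""
    best = 0.0
    for i in range(steps + 1):
        rho12 = i / steps
        for j in range(steps + 1):
            rho1s = -j / steps
            if rho12 ** 2 + rho1s ** 2 > 1.0:
                continue  # outside the admissible parameter set
            best = max(best, min(theorem7_terms(rho12, rho1s, P1R, P1D, P2, Q, N2, N3)))
    return best
```

A negative $\rho_{1s}$ lets the direct component partially cancel the state, which shrinks the $(\sqrt{Q} + \rho_{1s}\sqrt{P_{1D}})^2$ term at the cost of direct-link power.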

Remark 11: Similar to the DM case, the upper bound in Theorem 7 improves upon the cut-set upper bound through the second term of the minimization. The second term of the minimization is strictly tighter than that of the cut-set upper bound because it accounts for the rate loss incurred by not knowing the state $S^n$ at all at the source encoder component $X_{1R,i}$ that is heard at the relay, and for the fact that the relay output $Y_2^{i-1}$ can depend on the state only strictly causally in this case. Further, investigating closely the proof in Appendix F, it can be seen that, in contrast to the corresponding DM case, the relay ignores completely any information about the state in the multiaccess part of (60).

D. Capacity for Some Special Cases 

In this section, we characterize the capacity for some special Gaussian models. 

1) Capacity for some special cases: An important special case of (32) is when the interference affects only the channel to the destination, i.e.,

$$Y_{2,i} = X_{1R,i} + Z_{2,i} \tag{62a}$$
$$Y_{3,i} = X_{1D,i} + X_{2,i} + S_i + Z_{3,i}. \tag{62b}$$

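In this case, the upper bound in Theorem 7 is tight. The following theorem characterizes the channel capacity in this case.

Theorem 8: The capacity of the state-dependent Gaussian relay model (62) is given by

$$C_G = \max \min \left\{ \frac{1}{2} \log \left( 1 + \frac{P_{1R}}{N_2} \right) + \frac{1}{2} \log \left( 1 + \frac{P_{1D} (1 - \rho_{12}^2 - \rho_{1s}^2)}{N_3} \right),\ \frac{1}{2} \log \left( 1 + \frac{\big( \sqrt{P_2} + \rho_{12} \sqrt{P_{1D}} \big)^2}{P_{1D} (1 - \rho_{12}^2 - \rho_{1s}^2) + \big( \sqrt{Q} + \rho_{1s} \sqrt{P_{1D}} \big)^2 + N_3} \right) + \frac{1}{2} \log \left( 1 + \frac{P_{1D} (1 - \rho_{12}^2 - \rho_{1s}^2)}{N_3} \right) \right\} \tag{63}$$

where the maximization is over parameters $\rho_{12} \in [0, 1]$ and $\rho_{1s} \in [-1, 0]$ such that

$$\rho_{12}^2 + \rho_{1s}^2 \le 1. \tag{64}$$
Proof: The proof of Theorem 8 appears in Appendix G.

Remark 12: The computation of the lower bound (G-1) in the proof of Theorem 8 for the model (32) gives the following rate

$$\max \min \left\{ \frac{1}{2} \log \left( 1 + \frac{P_{1R}}{N_2 + Q} \right) + \frac{1}{2} \log \left( 1 + \frac{P_{1D} (1 - \rho_{12}^2 - \rho_{1s}^2)}{N_3} \right),\ \frac{1}{2} \log \left( 1 + \frac{\big( \sqrt{P_2} + \rho_{12} \sqrt{P_{1D}} \big)^2}{P_{1D} (1 - \rho_{12}^2 - \rho_{1s}^2) + \big( \sqrt{Q} + \rho_{1s} \sqrt{P_{1D}} \big)^2 + N_3} \right) + \frac{1}{2} \log \left( 1 + \frac{P_{1D} (1 - \rho_{12}^2 - \rho_{1s}^2)}{N_3} \right) \right\} \tag{65}$$

where the maximization is over parameters $\rho_{12} \in [0, 1]$ and $\rho_{1s} \in [-1, 0]$ such that

$$\rho_{12}^2 + \rho_{1s}^2 \le 1. \tag{66}$$

The achievable rate (65) differs from the upper bound (60) in Theorem 7 only through the first logarithm term, in which the state is taken as unknown noise. Substituting $\rho := \rho_{1s}$ and $\zeta := 1 - \rho_{12}^2 - \rho_{1s}^2$ in (60) and (65), it is easy to see that if $P_{1R}$, $P_{1D}$, $P_2$, $Q$, $N_2$ and $N_3$ satisfy

$$N_2 \le \max_{\zeta \in [0,1],\ \rho \in [-1,0]} \frac{P_{1R} \big[ P_{1D} \zeta + \big( \sqrt{Q} + \rho \sqrt{P_{1D}} \big)^2 + N_3 \big]}{\big( \sqrt{P_2} + \sqrt{1 - \zeta - \rho^2} \sqrt{P_{1D}} \big)^2} - Q \tag{67}$$

then the channel capacity is given by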



Cg = max -log 1 + — ; + - log 1 + ^, ■ (68) 



Let us now consider another special case of (31), in which $X_1 = (X_{1R}, X_{1D})$ with average power constraint
1,-Li < nPi on X", Y3 = {Y^^\ Y^^) and the conditional distribution Wy, |X]D,s,X2 factorizes as ^Yii)|X2^r<^'|XiD s' 

Yi,, = XiR,, + S, + Z2,i (69a) 



= XiD,, + S, + Z« (69b) 
Y^=X2,+zf^, (69c) 

where the noises Z^^' and Z^^' are zero mean Gaussian random variables with variances N3, and are mutually 
independent and independent from the state sequence S", the source input X" = (X"j^, -^id) ^'^'^ relay input X'^. 
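Corollary 2: The capacity of the state-dependent Gaussian relay model (69) is given by

$$C_G = \max \min \left\{ \frac{1}{2} \log \left( 1 + \frac{\gamma P_1}{N_2} \right),\ \frac{1}{2} \log \left( 1 + \frac{P_2}{N_3} \right) \right\} + \frac{1}{2} \log \left( 1 + \frac{(1 - \gamma) P_1}{N_3} \right) \tag{70}$$

where the maximization is over $\gamma \in [0, 1]$.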
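The trade-off in Corollary 2 between power spent on the relay path and power spent on the direct path can be explored numerically. The sketch below assumes that the capacity expression reads $\min\{\frac{1}{2}\log(1 + \gamma P_1/N_2), \frac{1}{2}\log(1 + P_2/N_3)\} + \frac{1}{2}\log(1 + (1-\gamma)P_1/N_3)$, maximized over $\gamma \in [0, 1]$; the function name, grid search, and base-2 logarithms are illustrative choices.

```python
import math


def corollary2_capacity(P1, P2, N2, N3, steps=1000):
    """Grid search over the power split gamma; returns (capacity, gamma)."""
    best = (0.0, 0.0)
    for k in range(steps + 1):
        gamma = k / steps
        # Information routed through the relay is limited by the weaker hop.
        via_relay = min(
            0.5 * math.log2(1.0 + gamma * P1 / N2),
            0.5 * math.log2(1.0 + P2 / N3),
        )
        # Independent information sent over the direct link.
        direct = 0.5 * math.log2(1.0 + (1.0 - gamma) * P1 / N3)
        best = max(best, (via_relay + direct, gamma))
    return best
```

When the relay link is very noisy, the search puts all the power on the direct link ($\gamma = 0$) and the value reduces to $\frac{1}{2}\log(1 + P_1/N_3)$.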

The proof of Corollary 2 follows by specializing the cut-set upper bound to the model (69) and then observing that this upper bound can actually be attained using a combination of binning and a generalized block Markov scheme in which $X_{1R}$ and $X_{1D}$ are zero-mean Gaussian with variances $\gamma P_1$ and $(1 - \gamma) P_1$, respectively, for some $0 \le \gamma \le 1$, independent of $S$ and $X_2$; $X_2$ is zero-mean Gaussian with variance $P_2$ independent of $S$; and $X_{1R}$ and $X_{1D}$ are obtained with standard DPCs for the links to the relay and to the receiver component that observes $X_{1D}$, respectively. The source sends information to the receiver via the relay through the dirty paper coded $X_{1R}$, and independent information via the direct link through the dirty paper coded $X_{1D}$.

2) Analysis of Some Extreme Cases: We now summarize the behavior of some of the developed lower and upper bounds in some extreme cases.
1) If $N_2 \to 0$, e.g., the relay is located spatially very close to the source, the lower bound of Theorem 5 and the cut-set upper bound (59) tend asymptotically to the same value

$$\frac{1}{2} \log \left( 1 + \frac{\big( \sqrt{P_1} + \sqrt{P_2} \big)^2}{N_3} \right) + o(1) \tag{71}$$

where $o(1) \to 0$ as $N_2 \to 0$.

Equation (71) reflects the rationale for our coding scheme for the lower bound in Theorem 5, which is tailored to be asymptotically optimal whenever the relay can learn with negligible distortion the input that it should send. In this case, the rate (71) can be interpreted as the information between two transmit antennas which both know the channel state and one receive antenna. (For comparison, note that the coding scheme of Theorem 6 achieves a rate smaller than that of Theorem 5 if $N_2 \to 0$: even though with the coding scheme of Theorem 6 the relay also obtains the state estimate at almost no expense if $N_2$ is arbitrarily small, it also needs to know the information message to perform binning.)

2) Arbitrarily strong channel state: In the asymptotic case $Q \to \infty$, the capacity of the Gaussian model (32) is given by

$$\frac{1}{2} \log \left( 1 + \frac{P_{1D}}{N_3} \right). \tag{72}$$

This can be easily seen since both the upper bound of Theorem 8 and the lower bound (65) tend to the RHS of (72) in this case, which is also clearly achievable through standard DPC at the source and by turning the relay off.

For the Gaussian model (31), the lower bound of Theorem 6 tends to

The lower bound of Theorem 5 does not depend on the strength of the channel state, as we indicated previously.

3) If $N_2 \to \infty$, i.e., the link to the relay is broken or too noisy, the cut-set upper bound (59) and the lower bound of Theorem 6 agree and give the channel capacity

$$\frac{1}{2} \log \left( 1 + \frac{P_1}{N_3} \right). \tag{74}$$

Also, the lower and upper bounds on the capacity of the model (32) agree and give the channel capacity as the RHS of (72).

Note that, for the Gaussian model (31), the lower bound of Theorem 5 is suboptimal if $N_2 \to \infty$, and tends to the rate (75).

This is because the distortion in Theorem 5 is equal to its maximum value $P_2$ in this case. Equation (75) reflects a limitation of our coding scheme for the lower bound in Theorem 5 if the relay fails to reconstruct the input described by the source. In this case, the input from the relay acts as additional noise at the destination, thus causing the cooperative transmission to perform worse than simple direct transmission. The achievable rate (75) is, however, still better than had the state been merely treated as unknown noise if $P_2 < Q$. (For comparison, note that the lower bound of Theorem 6 vanishes if $N_2 \to \infty$.)
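The extreme cases above can be compared side by side with a few closed-form expressions. The sketch below is built on assumptions that should be read as such: the small-$N_2$ limit is taken as $\frac{1}{2}\log(1 + (\sqrt{P_1} + \sqrt{P_2})^2/N_3)$, the large-$N_2$ capacity as $\frac{1}{2}\log(1 + P_1/N_3)$, the large-$N_2$ limit of the Theorem 5 scheme as $\frac{1}{2}\log(1 + P_1/(N_3 + P_2))$ (relay input acting as noise of power $P_2$), and the state-as-noise baseline as $\frac{1}{2}\log(1 + P_1/(N_3 + Q))$. The dictionary keys and function names are hypothetical, and the numbers used in the example are linear-scale values chosen for illustration.

```python
import math


def half_log2(snr):
    """0.5 * log2(1 + snr), in bits per channel use."""
    return 0.5 * math.log2(1.0 + snr)


def extreme_case_rates(P1, P2, Q, N3):
    """Closed-form rates for the extreme regimes discussed in the text."""
    return {
        # N2 -> 0: coherent transmission from source and relay.
        "small_N2_limit": half_log2((math.sqrt(P1) + math.sqrt(P2)) ** 2 / N3),
        # N2 -> infinity: direct dirty paper coding, relay unused.
        "large_N2_capacity": half_log2(P1 / N3),
        # N2 -> infinity with the Theorem 5 scheme: relay input acts as noise.
        "theorem5_large_N2": half_log2(P1 / (N3 + P2)),
        # Baseline (assumed form): direct transmission, state treated as noise.
        "state_as_noise": half_log2(P1 / (N3 + Q)),
    }
```

With $P_1 = P_2 = N_3 = 10$ and $Q = 15$, the Theorem 5 limit lies below the direct-transmission capacity but above the state-as-noise baseline, consistent with the condition $P_2 < Q$ mentioned in the text.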

V. Numerical Examples and Discussion 

In this section we discuss some numerical examples, for the general Gaussian RC with informed source (31), the model (32) and the special case (62).




Fig. 3. Illustration of the lower bound of Theorem 5 and the lower bound of Theorem 6 for the state-dependent general Gaussian RC with informed source (31) versus the SNR in the source-to-relay link (horizontal axis in dB). Numerical values are: $P_1 = P_2 = N_3 = 10$ dB and $Q = 15$ dB.



Figure 3 illustrates the lower bound of Theorem 5 and the lower bound of Theorem 6 for the model (31), as
functions of the signal-to-noise ratio (SNR) at the relay, i.e., SNR = P1/N2 (in decibels). Also shown for comparison
are the cut-set upper bound had the state been known also at the relay and the destination, and the trivial lower
bound obtained by considering the channel state as unknown noise and implementing full-DF at the relay. In order
to show the effect of describing the state to the relay, the figure also shows a special case of the lower bound of
Theorem[6]obtained by setting = in | |53) , i.e., a Gaussian version of the achievable rate il8\ that we mentioned 
in RemarklH and is a (slightly) improved version of [13, Theorem3]. 
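
For readers who want to reproduce a reference curve of this kind, the sketch below evaluates the classical cut-set bound of a state-free Gaussian relay channel over a sweep of P1/N2. It is a sketch under assumptions: we assume that, once the state is known at all nodes, it can be subtracted and the bound reduces to the standard state-free expression with independent noises at the relay and the destination. The function names and the grid search over the correlation coefficient rho are ours and are not taken from the paper.

```python
import math


def db_to_linear(x_db):
    """Convert a power value from decibels to linear scale."""
    return 10.0 ** (x_db / 10.0)


def cut_set_bound(p1, p2, n2, n3, grid=2001):
    """Max over rho in [0, 1] of min(broadcast cut, multi-access cut), in bits."""
    best = 0.0
    for k in range(grid):
        rho = k / (grid - 1)
        # Broadcast cut: source to (relay, destination), relay input known.
        broadcast = 0.5 * math.log2(
            1.0 + p1 * (1.0 - rho ** 2) * (1.0 / n2 + 1.0 / n3)
        )
        # Multi-access cut: (source, relay) to destination, coherent combining.
        multiaccess = 0.5 * math.log2(
            1.0 + (p1 + p2 + 2.0 * rho * math.sqrt(p1 * p2)) / n3
        )
        best = max(best, min(broadcast, multiaccess))
    return best


# Sweep the source-to-relay SNR with P1 = P2 = N3 = 10 dB.
p1 = p2 = n3 = db_to_linear(10)
sweep = []
for snr_db in range(-20, 51, 10):
    n2 = p1 / db_to_linear(snr_db)
    sweep.append((snr_db, cut_set_bound(p1, p2, n2, n3)))

for snr_db, bound in sweep:
    print(f"P1/N2 = {snr_db:4d} dB  ->  cut-set bound = {bound:.4f} bits/use")
```

At very low P1/N2 the broadcast cut dominates and the bound approaches the rate of the direct link alone, which is the behavior described for the case N2 → ∞.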

The figure shows that the lower bound of Theorem 5 is asymptotically optimal at large SNR, and the lower bound
of Theorem 6 is asymptotically optimal at small SNR. This shows the advantage, at large SNR, of transmitting to the
relay a description of the appropriate input that it should send rather than a description of the state itself.
At moderate SNR, however, sending a description of the state to the relay may improve upon sending to it
a description of the appropriate Gel'fand-Pinsker binned codeword that it should send. (How the two bounds
compare depends essentially on the strength of the state. For example, at large SNR, the stronger the state the larger
the advantage of the lower bound of Theorem 5 over that of Theorem 6.) Furthermore, the figure also shows that
the lower bound of Theorem 6 is better than that of [13, Theorem 3], thereby reflecting the utility of describing the
state to the relay (recall that the coding scheme that we employed for the lower bound of Theorem 6 involves also
a partial cancellation of the state by the source to the relay, so that the relay benefits from it and the source benefits
in turn). Figure 4 shows similar bounds computed for an example degraded Gaussian RC.




Fig. 4. Illustration of the lower bound of Theorem 5 and the lower bound of Theorem 6 for an example state-dependent
degraded Gaussian RC with informed source of (31), versus the SNR P1/N2 [dB] in the link source-to-relay. Numerical
values are: P1 = 10 dB, = 20 dB, Q = 15 dB, N3 = 10 dB.



Remark 13: The lower bound of Theorem 5 is asymptotically close to optimal at large SNR, as we mentioned in the
"Extremes Cases Analysis" section and as is visible from Figure 3. This is because the appropriate relay input, which
is precoded at the source against the state and is encoded in a manner that it should combine coherently with the
source transmission in the next block, can be sent by the source to the relay at almost no expense in power and can
be learned by the relay with negligible distortion in this case. One may be tempted to expect a similar behavior
for the lower bound of Theorem 6 since, for the latter as well, the relay can learn a "good" estimate of the state at
almost no expense in source's power and with negligible distortion. This is not the case, however, since our coding
scheme for Theorem 6 requires the relay to also decode the source's information message. Related to this aspect,
the effect of the limitation which we mentioned in Remark 10 is visible at large SNR for this lower bound. □

Figure 5 illustrates the upper bound (60) of Theorem 7 and the lower bound (65) for the special case model (32).
For comparison, the figure also shows the cut-set upper bound had the state been known also at the relay and the
destination, and the trivial lower bound obtained by considering the channel state as unknown noise and using a
generalized block Markov coding scheme as in [41]. The curves are plotted against the signal-to-noise ratio (SNR)
at the relay, i.e., SNR = P1R/N2 (in decibels). Observe that the upper bound (60) is strictly better than the cut-set
upper bound. The improvement arises because the upper bound (60) accounts for some inevitable rate loss which is
caused by not knowing the state at the relay, as we mentioned previously. Also, the improvement is visible mainly
at small to relatively large values of SNR.

Fig. 5. Lower and upper bounds on the capacity of the state-dependent Gaussian RC with informed source (32). (a)
Bounds versus the SNR P1R/N2 in the link source-to-relay, for numerical values P1R = P1D = P2 = N3 = 10 dB, Q = 5,
and (b) bounds versus the SNR P1D/N3 in the link source-to-destination, for P1R = P1D = P2 = N2 = 10 dB, Q = 20 dB.




Fig. 6. Capacity of the state-dependent Gaussian RC model (62), versus the SNR in the link source-to-relay. Numerical 
values are: Pir = 10 dB, Pm = = 20 dB, Q = 10 dB, N3 = 10 dB. 

Figure 6 illustrates the capacity result of (62) as given by Theorem 8, as a function of the SNR in the link
source-to-relay, P1R/N2 (in decibels). Also shown for comparison are the cut-set upper bound and the trivial lower
bound obtained by considering the channel state as unknown noise and using a generalized block Markov coding
scheme as in [41].

VI. Conclusion 

In this paper, we consider a state-dependent relay channel with the channel states available noncausally at only 
the source, i.e., neither at the relay nor at the destination. We refer to this communication model as state-dependent 
RC with informed source. This setup may model some scenarios of node cooperation over wireless networks with 
some of the terminals equipped with cognition capabilities that enable estimating to high accuracy the states of 
the channel. 

We investigate this problem in the discrete memoryless (DM) case and in the Gaussian case. For both cases, we
derive lower and upper bounds on the channel capacity. A key feature of the model we study is that, assuming
decode-and-forward relaying, the input of the relay should be generated using binning against the state that
controls the channel in order to combat its effect and, at the same time, combine coherently with the source
transmission. We develop two lower bounds on the capacity by using coding schemes which achieve this goal
differently. In the first coding scheme, the source describes the channel state to the relay and to the destination,
through a scheme that combines multiple-description coding, binning and decode-and-forward. The relay estimates
the transmitted information message and the channel state, and then utilizes the state estimate
to perform cooperative binning with the source for sending the information message. The destination utilizes
its output and the already recovered state to estimate the currently transmitted message and state
description. In the second coding scheme, the source describes to the relay the appropriate input that the relay
would send had the relay known the channel state. The relay then simply estimates this input and sends it in the
appropriate subsequent block. The lower bound obtained with this scheme is close to optimal in some
special cases.

Furthermore, the upper bounds that we establish in the discrete memoryless and the memoryless Gaussian
cases are not trivial and account for not knowing the state at the relay and destination. Also, considering a special
case in which the source input has two components, one of which is independent of the channel state, we show that
our upper bound is strictly tighter than that obtained by assuming that the channel state is also available at the
relay and the destination, i.e., the max-flow min-cut or cut-set upper bound, and that it helps characterize the rate
loss due to the asymmetry caused by having the channel state available at only one source encoder component.
Also, we characterize the channel capacity fully in some cases, including when the state does not affect the channel
to the relay.

Appendix 

Throughout this section we denote the set of strongly jointly $\epsilon$-typical sequences [42, Chapter 14.2] with respect
to the distribution $P_{X,Y}$ as $T_\epsilon^n(P_{X,Y})$.

A. Proof of Theorem 1

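Consider the random coding scheme that we outlined in Section III. We now analyse the average probability of
error.

Analysis of Probability of Error: The average probability of error is given by

\[
\begin{aligned}
\Pr(\text{Error}) &= \sum_{\mathbf{s} \in \mathcal{S}^n} \Pr(\mathbf{s}) \Pr(\text{error} \mid \mathbf{s}) \\
&\le \sum_{\mathbf{s} \notin T_\epsilon^n(Q_S)} \Pr(\mathbf{s}) + \sum_{\mathbf{s} \in T_\epsilon^n(Q_S)} \Pr(\mathbf{s}) \Pr(\text{error} \mid \mathbf{s}).
\end{aligned}
\tag{A-1}
\]

The first term, $\Pr(\mathbf{s} \notin T_\epsilon^n(Q_S))$, on the RHS of (A-1) goes to zero as $n \to +\infty$, by the strong asymptotic equipartition
property (AEP) [42, p. 384]. Thus, it is sufficient to upper bound the second term on the RHS of (A-1).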

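We now examine the probabilities of the error events associated with the encoding and decoding procedures.
The error event is contained in the union of the following error events, where the events $E_{1i}$ and $E_{2i}$ correspond to
encoding errors at block $i$; the events $E_{ki}$, $k = 3, \ldots, 6$, correspond to decoding errors at the relay at block $i$; and the
events $E_{ki}$, $k = 7, \ldots, 13$, correspond to decoding errors at the destination at block $i$.
Let $E_{1i} = E_{1i}^{(1)} \cup E_{1i}^{(2)} \cup E_{1i}^{(3)}$, with

\[
\begin{aligned}
E_{1i}^{(1)} &= \big\{ (\mathbf{s}[i+2], \hat{\mathbf{s}}_R[t_{Ri}]) \notin T_\epsilon^n(P_{S,\hat{S}_R}) \ \ \forall\, t_{Ri} \in [1, 2^{n\hat{R}_R}] \big\} \\
E_{1i}^{(2)} &= \big\{ (\mathbf{s}[i+2], \hat{\mathbf{s}}_D[t_{Di}]) \notin T_\epsilon^n(P_{S,\hat{S}_D}) \ \ \forall\, t_{Di} \in [1, 2^{n\hat{R}_D}] \big\} \\
E_{1i}^{(3)} &= \big\{ (\mathbf{s}[i+2], \hat{\mathbf{s}}_R[t_{Ri}], \hat{\mathbf{s}}_D[t_{Di}]) \notin T_\epsilon^n(P_{S,\hat{S}_R,\hat{S}_D}) \ \ \forall\, (t_{Ri}, t_{Di}) \in [1, 2^{n\hat{R}_R}] \times [1, 2^{n\hat{R}_D}] \big\}.
\end{aligned}
\tag{A-2}
\]

From known results in rate distortion theory [42, p. 336], it follows that $P(E_{1i}^{(1)}) \to 0$ exponentially with
$n$ if $\hat{R}_R > I(S; \hat{S}_R)$. Similarly, $P(E_{1i}^{(2)}) \to 0$ exponentially with $n$ if $\hat{R}_D > I(S; \hat{S}_D)$. It remains to show that
$P(E_{1i}^{(3)}) \to 0$ exponentially with $n$ if $\hat{R}_R + \hat{R}_D > I(S; \hat{S}_R, \hat{S}_D) + I(\hat{S}_R; \hat{S}_D)$, which we prove by following the
arguments in [35].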
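Define the random set

\[
\mathcal{A}_{\mathbf{s}[i+2]} = \big\{ (t_{Ri}, t_{Di}) \in [1, 2^{n\hat{R}_R}] \times [1, 2^{n\hat{R}_D}] \ \text{s.t.}\ (\mathbf{s}[i+2], \hat{\mathbf{s}}_R[t_{Ri}], \hat{\mathbf{s}}_D[t_{Di}]) \in T_\epsilon^n(P_{S,\hat{S}_R,\hat{S}_D}) \big\}.
\tag{A-3}
\]

Then, we have

\[
P(E_{1i}^{(3)}) = P(\|\mathcal{A}_{\mathbf{s}[i+2]}\| = 0)
\tag{A-4}
\]

and, using Chebyshev's inequality,

\[
\begin{aligned}
P(\|\mathcal{A}_{\mathbf{s}[i+2]}\| = 0) &\le P\big( \big| \|\mathcal{A}_{\mathbf{s}[i+2]}\| - E[\|\mathcal{A}_{\mathbf{s}[i+2]}\|] \big| \ge \epsilon E[\|\mathcal{A}_{\mathbf{s}[i+2]}\|] \big) \\
&\le \frac{\mathrm{var}(\|\mathcal{A}_{\mathbf{s}[i+2]}\|)}{\epsilon^2 (E[\|\mathcal{A}_{\mathbf{s}[i+2]}\|])^2}.
\end{aligned}
\tag{A-5}
\]

Now, to obtain bounds on $E[\|\mathcal{A}_{\mathbf{s}[i+2]}\|]$ and $\mathrm{var}(\|\mathcal{A}_{\mathbf{s}[i+2]}\|)$, we define the indicator functions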



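\[
I\big((t_{Ri}, t_{Di}) \in \mathcal{A}_{\mathbf{s}[i+2]}\big) =
\begin{cases}
1, & \text{if } (\mathbf{s}[i+2], \hat{\mathbf{s}}_R[t_{Ri}], \hat{\mathbf{s}}_D[t_{Di}]) \in T_\epsilon^n(P_{S,\hat{S}_R,\hat{S}_D}) \\
0, & \text{otherwise.}
\end{cases}
\tag{A-6}
\]

The cardinality of the set $\mathcal{A}_{\mathbf{s}[i+2]}$ is given by

\[
\|\mathcal{A}_{\mathbf{s}[i+2]}\| = \sum_{t_{Ri}=1}^{2^{n\hat{R}_R}} \sum_{t_{Di}=1}^{2^{n\hat{R}_D}} I\big((t_{Ri}, t_{Di}) \in \mathcal{A}_{\mathbf{s}[i+2]}\big).
\tag{A-7}
\]

Thus,

\[
\begin{aligned}
E[\|\mathcal{A}_{\mathbf{s}[i+2]}\|] &= \sum_{t_{Ri}=1}^{2^{n\hat{R}_R}} \sum_{t_{Di}=1}^{2^{n\hat{R}_D}} E\, I\big((t_{Ri}, t_{Di}) \in \mathcal{A}_{\mathbf{s}[i+2]}\big) \\
&\ge 2^{n[\hat{R}_R + \hat{R}_D - H(\hat{S}_R) - H(\hat{S}_D) + H(\hat{S}_R, \hat{S}_D | S) - \epsilon - 2\delta(\epsilon)]}.
\end{aligned}
\tag{A-9}
\]

Similarly, one can show that

\[
\mathrm{var}(\|\mathcal{A}_{\mathbf{s}[i+2]}\|) \le 2^{n[\hat{R}_R + \hat{R}_D - H(\hat{S}_R) - H(\hat{S}_D) + H(\hat{S}_R, \hat{S}_D | S) - \epsilon + 2\delta(\epsilon)]}.
\tag{A-10}
\]

Thus, the above Chebyshev's inequality yields

\[
P(\|\mathcal{A}_{\mathbf{s}[i+2]}\| = 0) \le \frac{1}{\epsilon^2}\, 2^{-n[\hat{R}_R + \hat{R}_D - H(\hat{S}_R) - H(\hat{S}_D) + H(\hat{S}_R, \hat{S}_D | S) - \epsilon - 6\delta(\epsilon)]}.
\tag{A-11}
\]

Then, $P(\|\mathcal{A}_{\mathbf{s}[i+2]}\| = 0) \to 0$ as $n \to \infty$ if

\[
\begin{aligned}
\hat{R}_R + \hat{R}_D &> H(\hat{S}_R) + H(\hat{S}_D) - H(\hat{S}_R, \hat{S}_D | S) + \epsilon + 6\delta(\epsilon) \\
&= I(S; \hat{S}_R, \hat{S}_D) + I(\hat{S}_R; \hat{S}_D) + \epsilon + 6\delta(\epsilon).
\end{aligned}
\tag{A-12}
\]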
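
The second-moment argument above can be illustrated numerically with a deliberately simplified model. In the sketch below we assume, purely for illustration, that the set is formed from $2^{n \cdot \text{rate}}$ independent candidates, each retained with probability $2^{-n \cdot \text{info}}$; in the actual proof the indicators are dependent because index pairs share codewords, which is why the variance requires the separate bound stated above. The function names and numerical values are ours.

```python
def empty_set_bound(mean, variance, eps):
    """Chebyshev bound on P(N = 0): var / (eps^2 * mean^2), capped at 1."""
    return min(1.0, variance / (eps ** 2 * mean ** 2))


def bound_for_blocklength(n, rate, info, eps=0.5):
    """Bound on the probability that no candidate is retained.

    ASSUMPTION: 2^{n*rate} independent candidates, each retained with
    probability 2^{-n*info}, so the count is binomial.
    """
    p = 2.0 ** (-n * info)
    mean = 2.0 ** (n * (rate - info))
    variance = mean * (1.0 - p)  # binomial variance: count * p * (1 - p)
    return empty_set_bound(mean, variance, eps)


# With rate > info the bound decays exponentially in n; otherwise it is vacuous.
for n in (10, 20, 40):
    print(n, bound_for_blocklength(n, rate=1.0, info=0.5),
          bound_for_blocklength(n, rate=0.5, info=1.0))
```

The threshold behavior (decay when the rate exceeds the information quantity, a trivial bound otherwise) mirrors the role of the rate condition in the argument.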

Let £2/ be the event that there is no pair (uRizVi^i, j*^,io„ j^^,k„ jR,),UD{iOi-i, jy^,iv„ j^^Ji, jo,)) satisfying ((lOj, 
i.e., the set D,|j,,j„ is empty. 

Again, using Chebychev's inequality, it is easy to see that 

= 0) < P(|||D,«,,„|| - E[D„,,„]| > eE[D,„,„]) 
var(||D„,„||) 



We obtain bounds on E[D,,;,,|^J and var(||D,,;,,Q,||) by proceeding in a way similar to that for the event £1,. We 
define the indicator fimctions, 

i[{ur{iu,^i, 7* , w„ ;■*,, k„ jR,)MD{zv,-i, iv„ 7*,, l„ ju,)) e K,„,d,) = 

j 1, if (uR(iy,_i, 7* ., iv„ 7*^., fc„ iR,), uu(iv,-i, 7*,, iv„ i*^, l„ ju,)) e 
0, otherwise. 
The cardinality of the set Di^.^^. is given by 

= Yj Yj iv,' iu,' fc' h'), UD(ro,-i, 7*,v zvu j^,, h, 7d,)) e (A-15) 

Thus, 

El©,,;,,,!,] = ^ 2], Ei((ur(z(;,-i, 7* ., iy„ /*,., fc„ /r,), UD(iy,-i, 7*,, ii^,, /y,, /„ 70,)) e 

> P, line, \\Jj^Jjj2~"^^^^'''^'^'''^°^^''^^*^^^°'^^^^^^ 

_ 2n[RR+RD-RR-RD-i(UR;UD|U,V,S,SR,SD)-o(l)] (A-16) 

where $o(1) \to 0$ as $n \to \infty$.
Evaluating the variance, we obtain that 

var(||D, ^ < 2"['^R+^D~j^«"*D"-f("K;"Dl"'^'S'SR.SD)+o(i)] (A-17) 
Therefore, for sufficiently large n 

P(P.R,.o,ll = 0)<e (A-18) 

provided that ([TTJ is true. 
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. Let E3, be the event that u(m;,_i, j*^, Wi, /*;), UR{Wi-i, j*^, Wi, k, ;* ) are not jointly typical with {yili], SR[tRi-2]) 
given v(ti;,_i, ;'*.). That is 

Em = {{y{wi-i, ;^;), u(w;,_i, ;*;, Wi, ;*;), uj?(iyi_i, ;*;, Wi, /*;, fc;, ;*•)/ y2[!l sj?[im-2]) ^ ■r^(Py,u,u5,y2,sj}- (A-19) 

For v{wi-i, i*.), u{wi-i, j*., Wi, ]*■), \iR{wi-i, 7* , Wi, /* ., k, 7*), UD(wi-i, Wi, j^^, h, j^) jointly typical with s[f], 
§R[tRi-2], sd[(di-2] and with the source input xi[z] and the relay input X2[t\, we have Pr(E3i|E5;,E2;) — > as 
n — > 00 by the Markov Lemma [42, p. 436]. 
. Let E4, be the event that u{w,^i, j*.,zv'., jud, nR{w,-i, j*.,iu'., ju„k„ jui) are jointly typical with (y2[!],SR[tR,-2]) 
given v(iy,_i, ;*.), for some w'. e [1,M], jui G /u, fc; e [1,Mr] and e /«, with w'^ 4^ Wu That is, 

E4i = {a 6 [1,M], ;ui e ]u,h e [l,MR],;Ri e ]r s.i: 9^ a;,., 

(v(w,_i, u(w,_i, ;•* ., w;, iuxl, UR(roi-i, /y;, TO-, ;ui, h, jm), yzli], §R[tRi-2]) e T^{Pv,u,Ur,Y2A^}- (A-20) 

Using the union bound and standard arguments on jointly typical sequences, the probability of the event £4,- 
conditioned on EJ ., E2;, E'^^ can be easily bounded as 

Pr(E4,|E5,,E^,,Ey < M/uMr/r2-«W'^''^'''^-«''I^-^1 

_ 2-nU(U;Y2\V,SR)~I{UASD\V,SR)-R+ie]_ (A-21) 

Thus, Pr(E4,|E5., E^., E^^.) ^Oasw^ooifR< I{U; Y^V, Sr) - I{U; S, Sd\V, Sr). 
. Let Esi be the event that u(iy,_i, 7* , iy„ UR{zVi^i, j*^,w„ i'^j^,k„ jRi) are jointly typical with (y2[!lsR[tRi_2]) 
given v(wi-i, j*^), for some y'^. e /u, ki e [1,Mr], € Jr with ^ /* .. That is, 

Esi = {3 j'ui e /u/'ci e [I/Mr],;^ e /r s.t. /y; 

[\{wi-i, /*;), u(w,_i, ., Wi, UR{wi-i, w;, fci, 7Ri), y2[zl §R[tRi-2]) e r^(^'y,uus,y2,SR)}- (^-22) 

Conditioned on the events Ej^., Ej;, E^^., E^^., the probability of the event Es; can be bounded using the union 
bound, as 

Pr(E5,|E^i,,E^,,E^,,E^,.) < /uMr/r2-«W"'"«'^^'^''I'^-^1 

_ 2-nU(U;Y2\V,SR)-I(U;S,SD\V,SR)+3e]^ (A-23) 

Thus, Pr(E5,|E5.,E^.,E^^,E^.) ^ as n ^ 00. 
. Let Eg, be the event that URiWi^i, i*.,iv„ j*^,k'^, jR,) is jointly typical with (y2[2],SR[tRi_2]) given v(Wi-i,j*^), 
u{wi-i, Wi, ;'*;), for some t e [1, Mr], e Jr with t + k. That is. 

Eft = ja ^ € [1,Mr], jRi e }r s.t. fc; 9t fci, 

[v{wi-i, ;•*.), u(w,_i, ., Wi, ;■*,.), xxR(wi-i, j*-, Wi, fcj, /r;), y2[z], §R[tR,-2]) e T^{Pv,u,Ur,Y2,Sr)}- (^'24) 

Conditioned on the events E^ ., E^^, E^., E^., E^., the probability of the event Eg, can be bounded using the union 
bound, as 

Pr(E6,|E5;,E^,,E5;,E^;,Ey < Mr/r2-"W"^'^^'«»I"'^-1 

= 2-"('^). (A-25) 
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Thus, Pr(E6,|E5,, E^,, E^,., E^,, E^,) ^ as n ^ oo. 

• For decoding the triple {Wi-i, jui-ii k-i) and the index ;Vi at the destination, let Eji be the union of the following 
two events 

= ;V(SR[tR,-3], ■Wi-2)), n{Wi-2, jviSRbm-s], Wi-l), ^i-l, jw-i), 

For v(a;,_2, u(a;,-2, 7;^,_i, 7u,_i), UR(iy,-2, 7y,_i, ^",-1, k^i, ud(ii',-2, /.--i, 
jointly typical with s[; - 1], SR[tR,_3], soltDi-s] and with the source input xi[; - 1] and the relay input X2[! - 1], 
we have Pr(E7^'| n^^^ Ey — > as n — > oo by the Markov Lemma. Similarly, Pr(E*,^'| n^^^ E^) — > as n — > oo. 
Thus, Pr(E7i| n^^j Ey — > as n — > oo. 

• For decoding the triple ]ui-\, U-i) and the index /y; at the destination, let Eg; be the event 

Esi = {a e [l,M],;ui_i e Ju,li-i g [1,Md],7d,-i e /d,;V, e Jy s.t: w-.^ Wi-i, 

{y{wi-2, u(roi_2, w'._^, iui-x), ud(w,-2, ;y,_i, jm-i, h-i, ;'d,-i), ysE? - 1], SoItRi-s]) e ^^(^KuuD^rs,^^)' 

(v«i,;V,),y3[j],§D[tD,-2]) e T,"(Pyy3 ^j). 
Conditioned on n^^^E^ the probability of the event Es; can be bounded using the union boxmd, as 
Pr(E8,| n^^i £[.) < M/uMD/D/y2-"[^('^''^°'^^'Sol'')-^J2'"[^('''^^'^°)^^l 

_ 2-nmKU;r3|SD)-I(KiJ;S,SE|SD)-R-[I(L/;Y3,SD|V0-I(U;S,SR,SD|y)]-+2e] _ (A-26) 

Thus, Pr(E8,| n[^j Ey ^ as n ^ c» if K < y; Ys, ^d) - I{U, V; S, Sr, Sd). 

• For decoding the triple (wi-i, jui-i> k-i) and the index ;Vi at the destination, let Eg, be the event 

Em = {3 ivi G /v s.t.: /y, i= /*., 

(v(w,_2, ;y,_i), u(iyi_2, 7y,_i, ud(w,_2, ;y,_i, Wi-i, ;d,-i), ysl^ - 1], soItK-s]) g r^(f'y,u,UD,Y3,SD)' 

(v(w,_l,;Vi),y3[zl§D[tDi-2]) G T^(Py,y3,SD)}- 

Conditioned on n^^^Ey the probability of the event Eg, can be boxmded using the union boxmd, as 

Pr(E9i| El.) < /^,2-«[^('^^^-So)-.] 

_ 2-''U(V;Y3,SD)-I(V;S,&R,SD)-2e]_ (A-27) 

Thus, Pr(E9il n^^^ Ey ^ as n ^ c» if I{V; Y3, §d) - S, §r, Sd) > 2e. 

• For decoding the triple /ui-i/ U-i) and the index /y, at the destination, let Eioi be the event 



Em = (3 e Ju,li-i g [1,Md],;d;-i g /D,;y, g /y s.t: /y, it 

(v(M;i_2, ;y;_i), u(M;i_2, Wi-i, Ud(w,-2, ;yi_i, m-l, fui-V /Oi-l)' ysl^ - 1]/ SD[tK-3]) G r^(Py,u,Uo,y3,So), 

(v(roi_l,/yi),y3[i],§D[tDi-2]) G T^(Py,y3,§o)}- 
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Conditioned on n^ ^E^^., the probability of the event Eio, can be bounded using the union bound, as 

_ 2^"lHUy;YSD)-I{Uy:SA\SD)-mU:Y3.SD\V)-I{U;S,SR,SD\V)]-+e]_ (A-28) 

Thus, Pr(Eio,i n,^^j £y — > as m — » oo. 
» For decoding the triple jui-i, h-i) and the index jvi at the destination, let En, be the event 

Ell/ = {3 7'y,_i G }u,l,-i e [1,Md],/d,-i e /D,/t/, e Jv s.t: 

{v{io,-2, iv,-i)' u(^'''-2' /v,-i/ ^i'l-i/ i'w-i)' UD(iy,-2, W/-1, 7'u,_i, h-i, joi-i), YiU - 1], sd[ir,-3]) e y,u,UD,r3,SD)' 

(v(ti;,-i,/*,),y3[i],SD[;D,-2]) e V^iPy^y^^sj}- 
Conditioned on n^^^E^., the probability of the event En, can be bounded using the union bound, as 
Pr(Eii,| nli^ El,) < /^MD/D2-"m"'"-^-SoH/)-.I 

_ 2-nlKU:Y3\V,SD)-msSR\V,SDyiI(U:Y3SD\Vyi{U;SSRA^\V)]-+3e]_ (A-29) 

Thus, Pr(Eii,| ni°j Ey — > as n — > 00. 
• For decoding the triple (j&,-i, yui-i, /,-i) and the index jvi at the destination, let Ei2; be the event 



Em = [3 G [1,Md],/d,-i g /Dr/w e /y s.t.: ;,_i, /y, 7*., 

(v(iy,-2, /y,_i), u(z(;,-2, /*,_i, if,-ir 7u,-i)/ "0(1^,-2, 7y,_i, w,-_i, 7*,_i, 7d,--i), ysfi - 1], sd[(r,-3]) g y,u,UD,r3,SD)' 

(v(iy,-l,7w)/y3[!lSD[lD,-2]) G T^'(fy,Y3,SD))- 

Conditioned on n^^jE^., the probability of the event E12, can be bounded using the union bound, as 
Pr(Ei2,| nli^ El,) < MD/D/y2-"W"°^^-Sol"'^-^l2-«W^^^-So)-.] 

_ 2^iiKV:Y3,SD)~W:S,SR,SD)-lI{U:Y3,SD\V)-I{U:SSR,Sa\V)]-+2£]_ (A-30) 

Thus, Pr(Ei2,i n^^j Ey — > as n — > 00. 

For decoding the triple (w,-i, jui-ii h-i) and the index 7V, at the destination, let E13; be the event 



Ei3,- = {3 Z;_i G [1,Md],7d,-i g jD,jv, G }v s.t.: r,_^ + Z,_i, 

(v(ri',-2, 7y,-i)' u(ri;,-2, 7y,-i' 7u,-i)' UD(fi',-2, 7y,-i' 7u,-i' ','-i' 7d,-i), y^Xi - 1], sd[(r,-3]) e T^"{^v,u,Uo;y3,Q' 

(v(t^;,_i,7*,),y3[f],SD[iD,-2]) G Vl{Vy^^ Q\ 
Conditioned on Pi^^^E^ the probability of the event E13, can be bounded using the union bound, as 

Pr(Ei3,| ni2j Ey < MD/D2-"t^<"°^^-*'^l'^'^'"^J 

_ 2-n[-K(li;y3,SD|l')-/(U;S,Ss,SD|V)+4e]_ (A-31) 

Thus, Pr(Ei3,| ni2^ Ey ^ as w ^ 00. 
This concludes the proof of Theorem 1.

January 19, 2013 DRAFT 






B. Proof of Theorem 2

First we generate a random codebook that we use to obtain the lower bound in Theorem 2. This scheme is based
on a combination of block Markov coding [33], Gel'fand-Pinsker binning [2], and classic rate distortion theory [42,
Chapter 13]. Next, we outline the encoding and decoding procedures.

We transmit in $B + 1$ blocks, each of length $n$. During each of the first $B$ blocks, the source encodes a message
$w_i \in [1, 2^{nR}]$ and sends it over the channel, where $i = 1, \ldots, B$ denotes the index of the block. For convenience we
let $w_{B+1} = 1$. For fixed $n$, the average rate $R\frac{B}{B+1}$ over $B + 1$ blocks approaches $R$ as $B \to +\infty$.

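Codebook generation: Fix a measure $P_{S,U,U_R,X_1,X_2,X,\hat{X},Y_2,Y_3}$ of the form (23). Calculate the marginal $P_{\hat{X}}$ induced by
this measure. Fix $\epsilon > 0$ and let

\[ J = 2^{n[I(U;S)+2\epsilon]}, \qquad J_R = 2^{n[I(U_R;U,S)+2\epsilon]} \tag{B-1a} \]

\[ M = 2^{n[R-4\epsilon]}, \qquad M_R = 2^{n[\hat{R}-4\epsilon]}. \tag{B-1b} \]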



1) We generate $JM$ independent and identically distributed (i.i.d.) codewords $\{\mathbf{u}(w, j)\}$ indexed by $w = 1, \ldots, M$,
$j = 1, \ldots, J$, each with i.i.d. components drawn according to $P_U$.

2) We generate $J_R M_R$ i.i.d. codewords $\{\mathbf{u}_R(m, j_R)\}$ indexed by $m = 1, \ldots, M_R$, $j_R = 1, \ldots, J_R$, each with i.i.d.
components drawn according to $P_{U_R}$.

3) Independently, we randomly generate a rate distortion codebook consisting of $M_R$ sequences $\hat{\mathbf{x}}$ drawn i.i.d.
according to the $n$-product of the marginal $P_{\hat{X}}$. We index these sequences as $\hat{\mathbf{x}}[m]$, $m = 1, \ldots, M_R$.

Encoding: We pick up the story in block $i$. Let $w_i \in \{1, \ldots, M\}$ be the new message to be sent from the source
node at the beginning of block $i$, and $w_{i+1} \in \{1, \ldots, M\}$ the message to be sent in the next block $i + 1$ (note that we
can assume that $w_i \neq w_{i+1}$, as the indices $\{w_k\}$ are assumed i.i.d. on $\{1, \ldots, 2^{nR}\}$, and so $\Pr(w_i = w_{i+1}) = 2^{-nR} \to 0$ as
$n \to +\infty$). The encoding at the beginning of block $i$ is as follows.

i) The source searches for the smallest $j \in \{1, \ldots, J\}$ such that $\mathbf{u}(w_i, j)$ is jointly typical with $\mathbf{s}[i]$. (The properties
of strongly typical sequences guarantee that there exists one such $j$.) Denote this $j$ by $j_i^* = j(\mathbf{s}[i], w_i)$.

ii) Similarly, the source finds $j_{i+1}^* = j(\mathbf{s}[i+1], w_{i+1})$ such that $\mathbf{u}(w_{i+1}, j_{i+1}^*)$ is jointly typical with $\mathbf{s}[i+1]$ and
then generates a vector $\mathbf{x}[w_{i+1}]$ with i.i.d. components given $\mathbf{u}(w_{i+1}, j_{i+1}^*)$ and $\mathbf{s}[i+1]$, drawn according to the
marginal $P_{X|U,S}$.

iii) Then, the source indexes $\mathbf{x}[w_{i+1}]$ by $m_i$ if there exists an $m_i \in \{1, \ldots, M_R\}$ such that $\mathbf{x}[w_{i+1}]$ and $\hat{\mathbf{x}}[m_i]$ are jointly
strongly typical. If there is more than one such $m_i$, the source selects the first in lexicographic order. If there is
no such $m_i$, let $m_i = 1$. Shannon's rate-distortion theory [42, Chapter 13] ensures that the encoding of $\mathbf{x}[w_{i+1}]$
is accomplished successfully with high probability provided that $n$ is sufficiently large and

\[ \hat{R} > I(X; \hat{X}). \tag{B-2} \]

iv) Next, the source looks for the smallest $j_R \in \{1, \ldots, J_R\}$ such that $\mathbf{u}_R(m_i, j_R)$ is jointly typical with $(\mathbf{s}[i], \mathbf{u}(w_i, j_i^*))$.
(Again, the properties of strongly typical sequences guarantee that there exists one such $j_R$.) Denote this $j_R$
by $j_{R_i}^* = j_R(\mathbf{s}[i], \mathbf{u}(w_i, j_i^*))$.

Continuing with the strategy, let $m_0 = 1$. The transmission at the beginning of block $i$ is as follows.

1) The relay knows $m_{i-1}$ (this will be justified below), and sends $\mathbf{x}_2[i] = \hat{\mathbf{x}}[m_{i-1}]$.

2) The source transmits the pair $(w_i, m_i)$. It sends a vector $\mathbf{x}_1[i]$ with i.i.d. components given the vectors $\mathbf{u}(w_i, j_i^*)$,
$\mathbf{u}_R(m_i, j_{R_i}^*)$ and $\mathbf{s}[i]$, drawn according to the marginal $P_{X_1|U,U_R,S}$ induced by the distribution (23).
Decoding: The reconstruction of the vector $\mathbf{x}[w_{i+1}]$ at the relay and the decoding procedure at the destination at
the end of block $i$ are as follows.

1) The relay knows $m_{i-1}$ and estimates $m_i$ from the received $\mathbf{y}_2[i]$. It declares that $m_i$ is sent if there is a unique
$m_i \in \{1, \ldots, M_R\}$ such that $\mathbf{u}_R(m_i, j_{R_i})$ and $\mathbf{y}_2[i]$ are jointly typical for some $j_{R_i} \in \{1, \ldots, J_R\}$. One can show that
the decoding error in this step is small for sufficiently large $n$ if

\[
\begin{aligned}
\hat{R} &< I(U_R; Y_2) - I(U_R; U, S) \\
&= I(U_R; Y_2) - I(U_R; S) - I(U_R; U | S).
\end{aligned}
\tag{B-3}
\]

2) The destination estimates $w_i$ from the received $\mathbf{y}_3[i]$. It declares that $w_i$ is sent if there is a unique $w_i \in \{1, \ldots, M\}$
such that $\mathbf{u}(w_i, j_i)$ and $\mathbf{y}_3[i]$ are jointly typical for some $j_i \in \{1, \ldots, J\}$. One can show that the decoding error in
this step is small for sufficiently large $n$ if

\[ R < I(U; Y_3) - I(U; S). \tag{B-4} \]

Analysis of Probability of Error: Fix a probability distribution $P_{S,U,U_R,X_1,X_2,X,\hat{X},Y_2,Y_3}$ satisfying (23). Let $\mathbf{s}[i]$ and
$(w_i, m_i)$ be the state sequence in block $i$ and the message pair sent from the source node in block $i$, respectively. As
we already mentioned above, at the beginning of block $i$ the source transmits $\mathbf{x}_1(w_i, m_i)$ and the relay transmits
$\mathbf{x}_2[i] = \hat{\mathbf{x}}[m_{i-1}]$.

The average probability of error is such that

\[
\Pr(\text{Error}) \le \sum_{\mathbf{s} \notin T_\epsilon^n(Q_S)} \Pr(\mathbf{s}) + \sum_{\mathbf{s} \in T_\epsilon^n(Q_S)} \Pr(\mathbf{s}) \Pr(\text{error} \mid \mathbf{s}).
\tag{B-5}
\]

The first term, $\Pr(\mathbf{s} \notin T_\epsilon^n(Q_S))$, on the RHS of (B-5) goes to zero as $n \to \infty$, by the asymptotic equipartition property
(AEP) [42, p. 384]. Thus, it is sufficient to upper bound the second term on the RHS of (B-5).

We now examine the probabilities of the error events associated with the encoding and decoding procedures.
The error event is contained in the union of the following error events, where the events $E_{1i}$, $E_{2i}$ and $E_{3i}$ correspond
to encoding errors at block $i$; the events $E_{4i}$ and $E_{5i}$ correspond to decoding errors at the relay at block $i$; and the
events $E_{6i}$ and $E_{7i}$ correspond to decoding errors at the destination at block $i$.

• Let $E_{1i}$ be the event that there is no sequence $\mathbf{u}(w_i, j)$ jointly typical with $\mathbf{s}[i]$, i.e.,

\[ E_{1i} = \big\{ \nexists\, j \in \{1, \ldots, J\} \ \text{s.t.}\ (\mathbf{u}(w_i, j), \mathbf{s}[i]) \in T_\epsilon^n(P_{U,S}) \big\}. \]

To bound the probability of the event $E_{1i}$, we use a standard argument [2]. More specifically, for $\mathbf{u}(w_i, j)$ and $\mathbf{s}[i]$
generated independently with i.i.d. components drawn according to $P_U$ and $Q_S$, respectively, the probability
that $\mathbf{u}(w_i, j)$ is jointly typical with $\mathbf{s}[i]$ is greater than $(1 - \epsilon) 2^{-n(I(U;S)+\epsilon)}$ for sufficiently large $n$. There is a total
of $J$ such $\mathbf{u}$'s in each bin. The probability of the event $E_{1i}$, the probability that there is no such $\mathbf{u}$, is therefore
bounded as

\[ \Pr(E_{1i}) \le \big[ 1 - (1 - \epsilon) 2^{-n(I(U;S)+\epsilon)} \big]^J. \tag{B-6} \]

Taking the logarithm on both sides of (B-6), substituting $J$ using (B-1a) and using $\ln(1 - x) \le -x$, we obtain
$\ln(\Pr(E_{1i})) \le -(1 - \epsilon) 2^{n\epsilon}$. Thus, $\Pr(E_{1i}) \to 0$ as $n \to \infty$.
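
The doubly exponential decay in this covering argument is easy to check numerically. The sketch below evaluates the logarithm of a bound of the form $[1 - (1-\epsilon)2^{-n(I+\epsilon)}]^J$ with bin size $J = 2^{n(I+2\epsilon)}$, and compares it with the simplified expression $-(1-\epsilon)2^{n\epsilon}$. The function names and the numerical values chosen for the mutual information $I$ and for $\epsilon$ are hypothetical and serve only as an illustration.

```python
import math


def log_prob_no_typical_codeword(n, mutual_info, eps):
    """Natural log of [1 - (1-eps)*2^{-n(I+eps)}]^J with J = 2^{n(I+2*eps)}."""
    bin_size = 2.0 ** (n * (mutual_info + 2.0 * eps))
    p_typical = (1.0 - eps) * 2.0 ** (-n * (mutual_info + eps))
    # log1p keeps precision when p_typical is tiny.
    return bin_size * math.log1p(-p_typical)


def simplified_log_bound(n, eps):
    """The closed-form bound -(1-eps)*2^{n*eps} obtained from ln(1-x) <= -x."""
    return -(1.0 - eps) * 2.0 ** (n * eps)


# Hypothetical values: I(U;S) = 0.5 bit, eps = 0.05.
for n in (50, 100, 200):
    exact = log_prob_no_typical_codeword(n, mutual_info=0.5, eps=0.05)
    simple = simplified_log_bound(n, eps=0.05)
    print(f"n = {n:3d}: ln bound = {exact:10.3f}, simplified = {simple:10.3f}")
```

For these values the two expressions agree closely, and both tend to minus infinity as n grows, so the probability bound itself vanishes.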

• Let $E_{2i}$ be the event that there is no sequence $\mathbf{u}(w_{i+1}, j)$ jointly typical with $\mathbf{s}[i+1]$, and $E_{3i}$ the event that there
is no sequence $\mathbf{u}_R(m_i, j_R)$ jointly typical with $(\mathbf{s}[i], \mathbf{u}(w_i, j_i^*))$. Proceeding similarly to the event $E_{1i}$, it can be
easily shown that, conditioned on $E_{1i}^c$ and $E_{1i}^c \cap E_{2i}^c$, respectively, these two events have vanishing probabilities
as $n \to +\infty$.

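• For the decoding at the relay, let $E_{4i}$ be the event that $\mathbf{u}_R(m_i, j_{R_i}^*)$ is not jointly typical with $\mathbf{y}_2[i]$. That is,

\[ E_{4i} = \big\{ (\mathbf{u}_R(m_i, j_{R_i}^*), \mathbf{y}_2[i]) \notin T_\epsilon^n(P_{U_R,Y_2}) \big\}. \tag{B-7} \]

For $\mathbf{u}(w_i, j_i^*)$, $\mathbf{u}_R(m_i, j_{R_i}^*)$ jointly typical with $\mathbf{s}[i]$, and with the source input $\mathbf{x}_1[i]$ and the relay input $\mathbf{x}_2[i]$, we
have $\Pr(E_{4i} | E_{1i}^c, E_{2i}^c, E_{3i}^c) \to 0$ as $n \to \infty$ by the Markov Lemma [42, p. 436].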

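• For the decoding at the relay, let $E_{5i}$ be the event that $\mathbf{u}_R(m_i', j_{R_i})$ is jointly typical with $\mathbf{y}_2[i]$ for some $m_i' \in [1, M_R]$
and $j_{R_i} \in \mathcal{J}_R$, with $m_i' \neq m_i$. That is,

\[
E_{5i} = \big\{ \exists\, m_i' \in [1, M_R],\ j_{R_i} \in \mathcal{J}_R \ \text{s.t.}\ m_i' \neq m_i,\ (\mathbf{u}_R(m_i', j_{R_i}), \mathbf{y}_2[i]) \in T_\epsilon^n(P_{U_R,Y_2}) \big\}.
\tag{B-8}
\]

Conditioned on the events $E_{1i}^c$, $E_{2i}^c$, $E_{3i}^c$ and $E_{4i}^c$, the probability of the event $E_{5i}$ can be bounded using the union
bound, as

\[
\begin{aligned}
\Pr(E_{5i} | E_{1i}^c, E_{2i}^c, E_{3i}^c, E_{4i}^c) &\le M_R J_R\, 2^{-n[I(U_R;Y_2)-\epsilon]} \\
&= 2^{-n[I(U_R;Y_2) - I(U_R;U,S) - \hat{R} + \epsilon]}.
\end{aligned}
\tag{B-9}
\]

Thus, $\Pr(E_{5i} | E_{1i}^c, E_{2i}^c, E_{3i}^c, E_{4i}^c) \to 0$ as $n \to \infty$ if $\hat{R} < I(U_R; Y_2) - I(U_R; S) - I(U_R; U | S)$.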

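• For the decoding at the destination, let $E_{6i}$ be the event that $\mathbf{u}(w_i, j_i^*)$ is not jointly typical with $\mathbf{y}_3[i]$. That is,

\[ E_{6i} = \big\{ (\mathbf{u}(w_i, j_i^*), \mathbf{y}_3[i]) \notin T_\epsilon^n(P_{U,Y_3}) \big\}. \tag{B-10} \]

For $\mathbf{u}(w_i, j_i^*)$, $\mathbf{u}_R(m_i, j_{R_i}^*)$ jointly typical with $\mathbf{s}[i]$, and with the source input $\mathbf{x}_1[i]$ and the relay input $\mathbf{x}_2[i]$, we
have $\Pr(E_{6i} | E_{1i}^c, E_{2i}^c, E_{3i}^c, E_{4i}^c, E_{5i}^c) \to 0$ as $n \to \infty$ by the Markov Lemma [42, p. 436].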

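• For the decoding at the destination, let $E_{7i}$ be the event that $\mathbf{u}(w_i', j_i)$ is jointly typical with $\mathbf{y}_3[i]$ for some
$w_i' \in [1, M]$ and $j_i \in \mathcal{J}$, with $w_i' \neq w_i$. That is,

\[
E_{7i} = \big\{ \exists\, w_i' \in [1, M],\ j_i \in \mathcal{J} \ \text{s.t.}\ w_i' \neq w_i,\ (\mathbf{u}(w_i', j_i), \mathbf{y}_3[i]) \in T_\epsilon^n(P_{U,Y_3}) \big\}.
\tag{B-11}
\]

Conditioned on the events $E_{1i}^c$, $E_{2i}^c$, $E_{3i}^c$, $E_{4i}^c$, $E_{5i}^c$ and $E_{6i}^c$, the probability of the event $E_{7i}$ can be bounded using
the union bound, as

\[
\begin{aligned}
\Pr(E_{7i} | E_{1i}^c, E_{2i}^c, E_{3i}^c, E_{4i}^c, E_{5i}^c, E_{6i}^c) &\le M J\, 2^{-n[I(U;Y_3)-\epsilon]} \\
&= 2^{-n[I(U;Y_3) - I(U;S) - R + \epsilon]}.
\end{aligned}
\tag{B-12}
\]

Thus, $\Pr(E_{7i} | E_{1i}^c, E_{2i}^c, E_{3i}^c, E_{4i}^c, E_{5i}^c, E_{6i}^c) \to 0$ as $n \to +\infty$ if $R < I(U; Y_3) - I(U; S)$.
This concludes the proof of Theorem 2.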

C. Proof of Theorem 3

Let an $(\epsilon_n, n, R)$ code be given. By Fano's inequality, we have

\[
\begin{aligned}
nR &= H(W) \\
&\le I(W; Y_3^n) + 1 + nR\epsilon_n.
\end{aligned}
\tag{C-1}
\]
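
To see how an inequality of this type constrains the rate, it can be rearranged as $R \le (I + 1) / (n(1 - \epsilon_n))$, where $I$ stands for the mutual information term. The short sketch below evaluates this rearrangement; the function name and the sample numbers are ours and purely illustrative.

```python
def fano_rate_bound(total_information, n, error_prob):
    """Largest R compatible with n*R <= I + 1 + n*R*eps, i.e. (I + 1) / (n * (1 - eps))."""
    if n <= 0 or not 0.0 <= error_prob < 1.0:
        raise ValueError("need n > 0 and 0 <= error_prob < 1")
    return (total_information + 1.0) / (n * (1.0 - error_prob))


# Hypothetical example: 0.5 bit of information per channel use.
for n, eps in ((1_000, 1e-2), (1_000_000, 1e-3)):
    print(n, eps, fano_rate_bound(0.5 * n, n, eps))
```

As n grows and the error probability vanishes, the bound approaches the per-letter information, which is why the additive Fano terms disappear in the limit.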

Let us define U, = {S1^^, Y'-\ Y'-^) and V, = (W, S'l^^, Y'-^), i = l,...,n. 
We have 

I(W; y;,') < I{W; Y\, Y'D 

= I(W; Yl, YD - Z(W; S") (C-2) 

n 

/-I 

n 

= m SI,; Y2„ Ys^irr'' n^') - ■f(S"+i; ^2,,, Ys.Iw, y^-i) - j(W; s,|s;;i) 

m SI,; Y2,u Y,,\Y'^-\ y^-i) - I{S,; Y'^\ Y'-^\W, S^,) - I{W; S.IS^,) 

1=1 

n 

= Y m, sii; V3,|yr\ ^r') - ks.; w, y'^-\ yr'is;;i) 

n 

= Yj ^(W; ^2,. Y3,,|S^i, yr', ^r') + Y3,\Y'2-\ y;') - I{S,; Y'^-\ Y','\Sl,) - I{Sr, w\si„ Y'^\ y^i) 

1=1 

n 

n 

Y ^2-" ^^.'\^iv ^1^' n"'' ^2,,) - i{s,; msu, yr', n-', 

1=1 

n 

= Y ^i^r, Y2,,, Y3,,|Lr„ X2,,) - I(t7,; S,\U„ X2,i) (C-3) 
1=1 

where: (a) follows since the message $W$ is independent of the state $S^n$; (b) follows from Csiszár and Körner's
"summation by parts" lemma [43]

n n 

Y ^^^li' ^2,<' ^3,,l w, y-1, y-i) = Y y'l^ y^'m s'U) (c-4) 
1=1 1=1 
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(d) 

1=1 






(c) follows similarly, from Csiszar and Korner's "summation by parts" 

H n 



(d) follows from the fact that $X_{2,i}$ is a deterministic function of $Y_2^{i-1}$.
Similarly, 



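\[
\begin{aligned}
I(W; Y_3^n) &\overset{(e)}{\le} \sum_{i=1}^{n} I(W, S_{i+1}^n, Y_3^{i-1}; Y_{3,i}) - I(W, S_{i+1}^n, Y_3^{i-1}; S_i) \\
&= \sum_{i=1}^{n} I(V_i; Y_{3,i}) - I(V_i; S_i)
\end{aligned}
\tag{C-6}
\]

where (e) follows exactly as in the converse part of the proof of the capacity of the Gel'fand-Pinsker channel [2] by
replacing $Y^n$ with $Y_3^n$.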

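From the above, we have

\[
\begin{aligned}
R &\le \frac{1}{n} \sum_{i=1}^{n} \big[ I(V_i; Y_{2,i}, Y_{3,i} | U_i, X_{2,i}) - I(V_i; S_i | U_i, X_{2,i}) \big] + \frac{1 + nR\epsilon_n}{n} \\
R &\le \frac{1}{n} \sum_{i=1}^{n} \big[ I(V_i; Y_{3,i}) - I(V_i; S_i) \big] + \frac{1 + nR\epsilon_n}{n}.
\end{aligned}
\tag{C-7}
\]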

We introduce a random variable $T$ which is uniformly distributed over $\{1, \ldots, n\}$. Set $S = S_T$, $U = U_T$, $V = V_T$,
$X_1 = X_{1,T}$, $X_2 = X_{2,T}$, $Y_2 = Y_{2,T}$, and $Y_3 = Y_{3,T}$. We substitute $T$ into the above bounds. Considering the first bound
in (C-7), we have

\[
\begin{aligned}
\frac{1}{n} \sum_{i=1}^{n} & I(V_i; Y_{2,i}, Y_{3,i} | U_i, X_{2,i}) - I(V_i; S_i | U_i, X_{2,i}) \\
&= I(V; Y_2, Y_3 | U, X_2, T) - I(V; S | U, X_2, T) \\
&= I(T, V; Y_2, Y_3 | U, X_2) - I(T; Y_2, Y_3 | U, X_2) - I(T, V; S | U, X_2) + I(T; S | U, X_2) \\
&\le I(T, V; Y_2, Y_3 | U, X_2) - I(T, V; S | U, X_2) + I(T; S | U, X_2) \\
&= I(T, V; Y_2, Y_3 | U, X_2) - I(T, V; S | U, X_2)
\end{aligned}
\tag{C-8}
\]

where in the last equality we used the fact that $T$ is independent of all the other variables.
Similarly, considering the second bound in (C-7), we obtain

\[
\begin{aligned}
\frac{1}{n} \sum_{i=1}^{n} & I(V_i; Y_{3,i}) - I(V_i; S_i) \\
&= I(V; Y_3 | T) - I(V; S | T) \\
&= I(T, V; Y_3) - I(T; Y_3) - I(T, V; S) + I(T; S) \\
&\le I(T, V; Y_3) - I(T, V; S).
\end{aligned}
\tag{C-9}
\]
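
The single-letterization steps above rest on the chain rule $I(V; Y | T) = I(T, V; Y) - I(T; Y)$. As a sanity check, the following Python sketch verifies this identity numerically on a randomly drawn joint distribution of $(T, V, Y)$. The helper names and the alphabet sizes are ours; the distribution is arbitrary and is not the one induced by any code.

```python
import itertools
import math
import random


def entropy(pmf, keep):
    """Entropy in bits of the marginal of `pmf` on the coordinates in `keep`."""
    marginal = {}
    for outcome, prob in pmf.items():
        key = tuple(outcome[k] for k in keep)
        marginal[key] = marginal.get(key, 0.0) + prob
    return -sum(p * math.log2(p) for p in marginal.values() if p > 0.0)


def mutual_information(pmf, a, b, given=()):
    """I(A; B | C) = H(A,C) + H(B,C) - H(A,B,C) - H(C)."""
    a, b, c = tuple(a), tuple(b), tuple(given)
    return (entropy(pmf, a + c) + entropy(pmf, b + c)
            - entropy(pmf, a + b + c) - entropy(pmf, c))


def random_pmf(sizes, seed=0):
    """Random joint pmf on a product alphabet with the given sizes."""
    rng = random.Random(seed)
    outcomes = list(itertools.product(*(range(s) for s in sizes)))
    weights = [rng.random() for _ in outcomes]
    total = sum(weights)
    return {o: w / total for o, w in zip(outcomes, weights)}


T, V, Y = 0, 1, 2  # coordinate positions in each outcome tuple
pmf = random_pmf((3, 2, 2))
lhs = mutual_information(pmf, (V,), (Y,), given=(T,))
rhs = mutual_information(pmf, (T, V), (Y,)) - mutual_information(pmf, (T,), (Y,))
print(f"I(V;Y|T) = {lhs:.6f},  I(T,V;Y) - I(T;Y) = {rhs:.6f}")
```

Since $I(T; Y) \ge 0$, the same computation also illustrates why dropping that term can only enlarge the bound.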
Let us now define U = U and V = (T, V). Using (C-7), (C-8) and (C-9), we then get

\[
\begin{aligned}
R &\le I(V; Y_2, Y_3 | U, X_2) - I(V; S | U, X_2) + \frac{1 + nR\epsilon_n}{n} \\
R &\le I(V; Y_3) - I(V; S) + \frac{1 + nR\epsilon_n}{n}.
\end{aligned}
\tag{C-10}
\]

So far we have shown that, for a given sequence of $(\epsilon_n, n, R)$-codes with $\epsilon_n$ going to zero as $n$ goes to infinity,
there exists a probability distribution of the form (26) such that the rate $R$ essentially satisfies (25). This completes
the proof of Theorem 3.

It remains to show that the rate (25) is not altered if one restricts the random variables $U$ and $V$ to have their
alphabet sizes limited as indicated in (27). This is done by invoking the support lemma [44, p. 310]. Fix a distribution
$\mu$ of $(S, U, V, X_1, X_2, Y_2, Y_3)$ on $\mathcal{P}(\mathcal{S} \times \mathcal{U} \times \mathcal{V} \times \mathcal{X}_1 \times \mathcal{X}_2 \times \mathcal{Y}_2 \times \mathcal{Y}_3)$ that has the form (26).
To prove the bound (27a) on \IL\, note that we have 

Y2, X2) - S\U, X2) 
= X2; Y2, Y3\U) - J,,(X2; Yz, Y3\U) - X2; S\U) + ^Qir, S\U) 

= H^,(Y2,Y3\U) - H^,(V,X2,Y2,Y3\U) + Hf,{V,X2,S\U) + H^,(X2\U) - H^{X2,S\U). (C-11) 

Hence, it suffices to show that the following functionals of |U(S, U, V, Xi, X2, Y2, Y3) 

n.x,x'{lL) = f (s, X, x') V (s, X, x') e SXX1XX2 (C-12a) 

ri(^0 = r df,(u)[H,,(y2, Yslu) - Hf,(y, X2, Y2, YjIm) + X2, S\u) + H,,(X2|m) - H,,(X2, S\u)] (C-12b) 

can be preserved with another measure $\mu'$ that has the form (26). Observing that there is a total of $|\mathcal{S}||\mathcal{X}_1||\mathcal{X}_2|$
functionals in (C-12), this is ensured by a standard application of the support lemma; and this shows that the
cardinality of the alphabet of the auxiliary random variable $U$ can be limited as indicated in (27a) without altering
the rate (25).

Once the alphabet of !i is fixed, we apply similar arguments to bound the alphabet of V, where this time 
(|S||Xi||X2l)^ - 1 functionals must be satisfied in order to preserve the joint distribution of (S, U,Xi,X2), and one 
more functional to preserve 

\[ I_\mu(V; Y_3) - I_\mu(V; S) = H_\mu(Y_3) - H_\mu(S) - H_\mu(Y_3 | V) + H_\mu(S | V), \tag{C-13} \]
yielding the bound indicated in (27b). This completes the proof of Theorem 3.

D. Proof of Theorem^ 

We prove that for any $(\epsilon_n, n, R)$ code consisting of a mapping $\varphi^n = (\varphi_{1R}^n, \varphi_{1D}^n)$ at the source with $\varphi_{1R}^n : \mathcal{W} \to \mathcal{X}_{1R}^n$
and $\varphi_{1D}^n : \mathcal{W} \times \mathcal{S}^n \to \mathcal{X}_{1D}^n$, a sequence of mappings $\varphi_{2,i} : \mathcal{Y}_2^{i-1} \to \mathcal{X}_2$, $i = 1, \ldots, n$, at the relay, and a mapping
$\psi^n : \mathcal{Y}_3^n \to \mathcal{W}$ at the decoder with average error probability $P_e^n \to 0$ as $n \to \infty$, the rate $R$ must satisfy (28).
By Fano's inequality, we have

\[ H(W | Y_3^n) \le nR\epsilon_n + 1 = n\delta_n. \tag{D-1} \]

Thus,

\[ nR = H(W) \le I(W; Y_3^n) + n\delta_n. \tag{D-2} \]

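We now upper bound $I(W; Y_3^n)$ as in the following lemma, the proof of which follows.
Lemma 1:

\begin{align}
\text{i)}\quad I(W; Y_3^n) &\le \sum_{i=1}^{n} I(X_{1R,i}; Y_{2,i}|S_i, X_{2,i}) + I(X_{1D,i}; Y_{3,i}|S_i, X_{2,i}) \tag{D-3a} \\
\text{ii)}\quad I(W; Y_3^n) &\le \sum_{i=1}^{n} I(X_{1D,i}; Y_{3,i}|S_i, X_{2,i}) + I(X_{2,i}; Y_{3,i}). \tag{D-3b}
\end{align}

Proof: To simplify the notation, we use $S^i = (S_1, S_2, \ldots, S_i)$, $Y_k^i = (Y_{k,1}, Y_{k,2}, \ldots, Y_{k,i})$, $k = 2, 3$, and $X_j^i = (X_{j,1}, X_{j,2}, \ldots, X_{j,i})$, $j = 1R, 1D, 2$.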

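1) The proof of the bound on $I(W; Y_3^n)$ given in i) follows straightforwardly by revealing the state to the destination and using the channel structure:

\begin{align}
I(W; Y_3^n) &\overset{(a)}{\le} \sum_{i=1}^{n} I(X_{1R,i}, X_{1D,i}; Y_{2,i}, Y_{3,i}|X_{2,i}, S_i) \tag{D-4} \\
&= \sum_{i=1}^{n} I(X_{1R,i}, X_{1D,i}; Y_{2,i}|X_{2,i}, S_i) + I(X_{1R,i}, X_{1D,i}; Y_{3,i}|X_{2,i}, S_i, Y_{2,i}) \tag{D-5} \\
&= \sum_{i=1}^{n} I(X_{1R,i}; Y_{2,i}|X_{2,i}, S_i) + I(X_{1D,i}; Y_{2,i}|X_{1R,i}, X_{2,i}, S_i) \nonumber \\
&\qquad + I(X_{1R,i}, X_{1D,i}; Y_{3,i}|X_{2,i}, S_i, Y_{2,i}) \tag{D-6} \\
&\overset{(b)}{=} \sum_{i=1}^{n} I(X_{1R,i}; Y_{2,i}|X_{2,i}, S_i) + I(X_{1R,i}, X_{1D,i}; Y_{3,i}|X_{2,i}, S_i, Y_{2,i}) \tag{D-7} \\
&= \sum_{i=1}^{n} I(X_{1R,i}; Y_{2,i}|X_{2,i}, S_i) + H(Y_{3,i}|X_{2,i}, S_i, Y_{2,i}) - H(Y_{3,i}|X_{1R,i}, X_{1D,i}, X_{2,i}, S_i, Y_{2,i}) \tag{D-8} \\
&\overset{(c)}{=} \sum_{i=1}^{n} I(X_{1R,i}; Y_{2,i}|X_{2,i}, S_i) + H(Y_{3,i}|X_{2,i}, S_i, Y_{2,i}) - H(Y_{3,i}|X_{1D,i}, X_{2,i}, S_i) \tag{D-9} \\
&\overset{(d)}{\le} \sum_{i=1}^{n} I(X_{1R,i}; Y_{2,i}|X_{2,i}, S_i) + H(Y_{3,i}|X_{2,i}, S_i) - H(Y_{3,i}|X_{1D,i}, X_{2,i}, S_i) \tag{D-10} \\
&= \sum_{i=1}^{n} I(X_{1R,i}; Y_{2,i}|X_{2,i}, S_i) + I(X_{1D,i}; Y_{3,i}|X_{2,i}, S_i) \tag{D-11}
\end{align}

where: (a) follows trivially by revealing the state to the destination; (b) follows since $X_{1D,i} \leftrightarrow (X_{1R,i}, X_{2,i}, S_i) \leftrightarrow Y_{2,i}$ is a Markov chain; (c) follows since $(X_{1R,i}, Y_{2,i}) \leftrightarrow (X_{1D,i}, X_{2,i}, S_i) \leftrightarrow Y_{3,i}$ is a Markov chain; and (d) follows since conditioning reduces entropy.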
2) The proof of the bound on $I(W; Y_3^n)$ given in ii) proceeds as follows.

\begin{align}
I(W; Y_3^n) &= I(W, S^n; Y_3^n) - I(S^n; Y_3^n|W) \nonumber \\
&= \sum_{i=1}^{n} I(W, S^n; Y_{3,i}|Y_3^{i-1}) - H(S^n|W) + H(S^n|W, Y_3^n) \nonumber \\
&\overset{(e)}{=} \sum_{i=1}^{n} H(Y_{3,i}|Y_3^{i-1}) - H(Y_{3,i}|W, S^n, Y_3^{i-1}) - H(S_i) + H(S_i|W, Y_3^n, S^{i-1}) \nonumber \\
&\overset{(f)}{\le} \sum_{i=1}^{n} H(Y_{3,i}) - H(Y_{3,i}|X_{1D,i}, X_{2,i}, S_i) - H(S_i) + H(S_i|W, Y_3^n, S^{i-1}) \nonumber \\
&\overset{(g)}{=} \sum_{i=1}^{n} H(Y_{3,i}) - H(Y_{3,i}|X_{1D,i}, X_{2,i}, S_i) - H(S_i) + H(S_i|W, Y_3^n, S^{i-1}, Y_2^{i-1}) \nonumber \\
&\overset{(h)}{=} \sum_{i=1}^{n} H(Y_{3,i}) - H(Y_{3,i}|X_{1D,i}, X_{2,i}, S_i) - H(S_i) + H(S_i|W, Y_3^n, S^{i-1}, Y_2^{i-1}, X_{2,i}) \nonumber \\
&\overset{(i)}{\le} \sum_{i=1}^{n} I(X_{1D,i}, X_{2,i}, S_i; Y_{3,i}) - H(S_i) + H(S_i|X_{2,i}, Y_{3,i}) \nonumber \\
&= \sum_{i=1}^{n} I(X_{1D,i}, X_{2,i}, S_i; Y_{3,i}) - I(S_i; X_{2,i}, Y_{3,i}) \nonumber \\
&= \sum_{i=1}^{n} I(X_{1D,i}; Y_{3,i}|S_i, X_{2,i}) + I(X_{2,i}; Y_{3,i}) - I(X_{2,i}; S_i) \nonumber \\
&\overset{(j)}{=} \sum_{i=1}^{n} I(X_{1D,i}; Y_{3,i}|S_i, X_{2,i}) + I(X_{2,i}; Y_{3,i}), \tag{D-12}
\end{align}

where: (e) follows from the fact that the state $S^n$ is i.i.d. and is independent of the message $W$; (f) follows since conditioning reduces entropy and $(W, S^n, Y_3^{i-1}) \leftrightarrow (X_{1D,i}, X_{2,i}, S_i) \leftrightarrow Y_{3,i}$ is a Markov chain; (g) follows since $Y_2^{i-1} \leftrightarrow (W, S^{i-1}, Y_3^n) \leftrightarrow S_i$ is a Markov chain; (h) follows from the fact that $X_{2,i}$ is a deterministic function of $Y_2^{i-1}$; (i) follows from the fact that conditioning reduces entropy; and (j) holds since $X_{2,i}$ is independent of $S_i$. ■
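The last two equalities before (D-12) rest on a chain-rule identity that holds for any joint distribution: $I(X_{1D}, X_2, S; Y_3) - I(S; X_2, Y_3) = I(X_{1D}; Y_3|S, X_2) + I(X_2; Y_3) - I(X_2; S)$. The following is a minimal numerical sanity check, not part of the original proof; the alphabet sizes, the random pmf and the helper names (`entropy_of`, `cond_mi`, `check_d12_identity`) are illustrative assumptions.

```python
import numpy as np

def entropy_of(p, axes):
    """Entropy (bits) of the marginal of the joint pmf `p` on the given axes."""
    if not axes:
        return 0.0
    drop = tuple(ax for ax in range(p.ndim) if ax not in axes)
    m = p.sum(axis=drop) if drop else p
    m = m[m > 0]
    return float(-(m * np.log2(m)).sum())

def cond_mi(p, a, b, c=()):
    """I(A;B|C) = H(A,C) + H(B,C) - H(A,B,C) - H(C)."""
    a, b, c = set(a), set(b), set(c)
    return (entropy_of(p, a | c) + entropy_of(p, b | c)
            - entropy_of(p, a | b | c) - entropy_of(p, c))

def check_d12_identity(seed=0):
    rng = np.random.default_rng(seed)
    # Axes: 0 = X1D, 1 = X2, 2 = S, 3 = Y3 (arbitrary small alphabets).
    p = rng.random((2, 3, 2, 3))
    p /= p.sum()
    X1D, X2, S, Y3 = 0, 1, 2, 3
    lhs = cond_mi(p, {X1D, X2, S}, {Y3}) - cond_mi(p, {S}, {X2, Y3})
    rhs = (cond_mi(p, {X1D}, {Y3}, {S, X2}) + cond_mi(p, {X2}, {Y3})
           - cond_mi(p, {X2}, {S}))
    return lhs, rhs
```

Both returned values should agree up to floating-point error. Step (j) then drops $I(X_{2,i}; S_i)$, which is zero only because the relay input is independent of the state; the random pmf above does not impose that independence, which is why the term is kept on the right-hand side.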
We introduce a random variable $T$ which is uniformly distributed over $\{1, \ldots, n\}$. Set $S = S_T$, $X_{1R} = X_{1R,T}$, $X_{1D} = X_{1D,T}$, $X_2 = X_{2,T}$, $Y_2 = Y_{2,T}$, and $Y_3 = Y_{3,T}$. We substitute $T$ into the above bounds. Considering the bound (D-12), we obtain

\begin{align}
\frac{1}{n} \sum_{i=1}^{n} \big[ I(X_{1D,i}; Y_{3,i}|S_i, X_{2,i}) + I(X_{2,i}; Y_{3,i}) \big]
&= I(X_{1D}; Y_3|S, X_2, T) + I(X_2; Y_3|T) \nonumber \\
&= I(X_{1D}, X_2, S; Y_3|T) - I(S; X_2, Y_3|T) \tag{D-13}
\end{align}

and, similarly,

\begin{align}
\frac{1}{n} \sum_{i=1}^{n} \big[ I(X_{1R,i}; Y_{2,i}|X_{2,i}, S_i) + I(X_{1D,i}; Y_{3,i}|X_{2,i}, S_i) \big]
= I(X_{1R}; Y_2|S, X_2, T) + I(X_{1D}; Y_3|S, X_2, T) \tag{D-14}
\end{align}

where the distribution on $(T, S, X_{1R}, X_{1D}, X_2, Y_2, Y_3)$ from a given code is of the form

\[ P_{T,S,X_{1R},X_{1D},X_2,Y_2,Y_3} = Q_S P_T P_{X_2|T} P_{X_{1R}|X_2,T} P_{X_{1D}|S,X_2,T} \tag{D-15} \]

We now eliminate the variable $T$ from (D-13) and (D-14) as follows. The right-hand side of (D-13) can be bounded as

\begin{align}
I(X_{1D}, X_2, S; Y_3|T) - I(S; X_2, Y_3|T)
&\overset{(k)}{\le} H(Y_3) - H(Y_3|X_{1D}, X_2, S) - H(S|T) + H(S|X_2, Y_3, T) \nonumber \\
&= I(X_{1D}, X_2, S; Y_3) - H(S|T) + H(S|X_2, Y_3, T) \nonumber \\
&\overset{(l)}{\le} I(X_{1D}, X_2, S; Y_3) - H(S) + H(S|X_2, Y_3) \nonumber \\
&= I(X_{1D}, X_2, S; Y_3) - I(S; X_2, Y_3) \nonumber \\
&= I(X_{1D}; Y_3|S, X_2) + I(X_2; Y_3), \tag{D-16}
\end{align}

where:

(k) holds since $H(Y_3|T) \le H(Y_3)$ and $H(Y_3|X_{1D}, X_2, S, T) = H(Y_3|X_{1D}, X_2, S)$ (by the Markovian relation $T \leftrightarrow (X_{1D}, X_2, S) \leftrightarrow Y_3$); and

(l) holds since $S$ is independent of $T$ and $H(S|X_2, Y_3, T) \le H(S|X_2, Y_3)$.
Similarly, the right-hand side of (D-14) can be bounded as

\[ I(X_{1R}; Y_2|S, X_2, T) + I(X_{1D}; Y_3|S, X_2, T) \le I(X_{1R}; Y_2|S, X_2) + I(X_{1D}; Y_3|S, X_2). \tag{D-17} \]
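The mechanism behind step (k) and (D-17) is that a time-sharing variable cannot increase a mutual-information term when the channel output depends on $T$ only through the channel input. The sketch below checks the simplest instance, $I(X; Y|T) \le I(X; Y)$ under $T \leftrightarrow X \leftrightarrow Y$, on randomly drawn distributions. It is an illustration under assumed alphabet sizes, not the authors' argument, and it omits the extra conditioning on $(S, X_2)$ that appears in (D-17); the function names are hypothetical.

```python
import numpy as np

def entropy(p):
    """Entropy in bits of a pmf given as an array of any shape."""
    p = np.asarray(p).ravel()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def mutual_information(p_xy):
    """I(X;Y) in bits for a 2-D joint pmf."""
    return entropy(p_xy.sum(axis=1)) + entropy(p_xy.sum(axis=0)) - entropy(p_xy)

def time_sharing_gap(seed=0, nt=3, nx=4, ny=3):
    rng = np.random.default_rng(seed)
    p_t = rng.dirichlet(np.ones(nt))                    # P_T
    p_x_given_t = rng.dirichlet(np.ones(nx), size=nt)   # rows indexed by t
    w = rng.dirichlet(np.ones(ny), size=nx)             # channel W(y|x), rows indexed by x
    # Joint pmf over (t, x, y); Y depends on T only through X.
    joint = p_t[:, None, None] * p_x_given_t[:, :, None] * w[None, :, :]
    i_xy = mutual_information(joint.sum(axis=0))
    i_xy_given_t = sum(p_t[t] * mutual_information(joint[t] / p_t[t])
                       for t in range(nt))
    return i_xy, i_xy_given_t
```

For every seed, the second returned value should not exceed the first, since $H(Y|T) \le H(Y)$ while $H(Y|X, T) = H(Y|X)$.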

Finally, combining (D-2), (D-12), (D-16) on the one hand, and (D-2), (D-11), (D-17) on the other hand, we get

\begin{align}
R &\le I(X_{1D}; Y_3|S, X_2) + I(X_2; Y_3) \tag{D-18a} \\
R &\le I(X_{1R}; Y_2|S, X_2) + I(X_{1D}; Y_3|S, X_2), \tag{D-18b}
\end{align}

where the distribution on $(S, X_{1R}, X_{1D}, X_2, Y_2, Y_3)$, obtained by marginalizing (D-15) over the variable $T$, has the form given in (29).

We conclude that, for a given sequence of $(\epsilon_n, n, R)$-codes with $\epsilon_n$ going to zero as $n$ goes to infinity, there exists a probability distribution of the form (29) such that the rate $R$ satisfies (D-18). This completes the proof of Theorem 4.

E. Proof of Theorem 6

The encoding and transmission scheme is as follows. Let $P_{1r} \ge 0$, $P_{1d} \ge 0$ and $D > 0$ be given such that $P_{1r} + P_{1d} \le P_1$ and $0 < D \le Q$. Also, consider the test channel $\hat{S}_R = aS + \tilde{S}_R$, where $a := 1 - D/Q$ and $\tilde{S}_R$ is a Gaussian random variable with zero mean and variance $\sigma^2 = D(1 - D/Q)$, independent from $S$. Using this test channel, we calculate $E[(S - \hat{S}_R)^2] = D$ and $E[\hat{S}_R^2] = Q - D$. Let $X_2 \sim \mathcal{N}(0, P_2)$ be jointly Gaussian with $\hat{S}_R$ with $E[X_2 \hat{S}_R] = 0$ and independent from $S$, and $X_{sr} \sim \mathcal{N}(0, \bar{\theta} P_{1r})$ jointly Gaussian with $(S, \hat{S}_R)$ with $E[X_{sr} S] = 0$ and $E[X_{sr} \hat{S}_R] = 0$, where $0 \le \theta \le 1$. Also, let $X_{wr} \sim \mathcal{N}(0, \theta P_{1r})$ be jointly Gaussian with $(X_2, S)$ and independent of $X_{sr}$, with $E[X_{wr} S] = \sigma_{1s}$ and $E[X_{wr} X_2] = \sigma_{12}$; and $X_{wd} \sim \mathcal{N}(0, P_{1d})$ jointly Gaussian with and independent of $(X_{wr}, X_{sr}, X_2, S, \hat{S}_R)$. In what follows, we use the random variables $V$, $U$, $U_1$ and $U_R$ given by (57) to generate the auxiliary codewords $V_i$, $U_i$, $U_{1,i}$ and $U_{R,i}$ which we will use in the sequel. Also, recall the definition of $\tilde{Q}$, $\xi$, and $\alpha_2$ in (55) and (56), respectively, which we will use in the rest of this proof.

We decompose the message $W$ to be sent from the source into two parts $W_r$ and $W_d$. The input $X_1^n$ from the source is divided into three independent parts, i.e., $X_1^n = X_{sr}^n + X_{wr}^n + X_{wd}^n$, where $X_{sr}^n$ carries a description $\hat{S}_R^n$ of the state $S^n$ that is intended to be recovered only at the relay and has power constraint $n\bar{\theta}P_{1r}$, $X_{wr}^n$ carries message $W_r$ and has power constraint $n\theta P_{1r}$, and $X_{wd}^n$ carries message $W_d$ and has power constraint $nP_{1d}$, with $P_1 = P_{1r} + P_{1d}$. The message $W_r$ is sent through the relay at rate $R_r$ and the message $W_d$ is sent directly to the destination at rate $R_d$. The total rate is $R = R_r + R_d$.

As in the discrete case, a block Markov encoding is used. Let $w_i = (w_{r,i}, w_{d,i}) \in [1, 2^{nR_r}]\times[1, 2^{nR_d}]$ denote the message to be transmitted in block $i$ and $\mathbf{s}[i]$ denote the state controlling the channel in block $i$. The source quantizes $\mathbf{s}[i]$ into $\hat{\mathbf{s}}_R[\iota_{R,i-1}]$, where $\iota_{R,i-1} \in [1, 2^{nR_R}]$. Using the aforementioned test channel, the source can encode $\mathbf{s}[i]$ successfully at the quantization rate

\begin{align}
R_R &= I(S; \hat{S}_R) \nonumber \\
&= \frac{1}{2} \log\Big(\frac{Q}{D}\Big). \tag{E-1}
\end{align}
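The test-channel quantities quoted above follow from second moments alone. The sketch below recomputes the distortion $E[(S - \hat{S}_R)^2]$, the description power $E[\hat{S}_R^2]$ and the Gaussian mutual information $I(S; \hat{S}_R)$ from $Q$ and $D$; the function name and the sample values $Q = 4$, $D = 1$ are illustrative assumptions, and rates are expressed in bits.

```python
import math

def quantizer_stats(Q, D):
    """Second-order statistics of the test channel S_hat = a*S + S_tilde."""
    a = 1.0 - D / Q                      # scaling of the state
    var_tilde = D * (1.0 - D / Q)        # variance of the independent Gaussian part
    var_hat = a * a * Q + var_tilde      # E[S_hat^2]
    distortion = (1.0 - a) ** 2 * Q + var_tilde   # E[(S - S_hat)^2]
    cov = a * Q                          # E[S * S_hat]
    rho_sq = cov * cov / (Q * var_hat)   # squared correlation coefficient
    rate = -0.5 * math.log2(1.0 - rho_sq)  # I(S; S_hat) for jointly Gaussian variables
    return distortion, var_hat, rate

# Example: Q = 4, D = 1 gives distortion 1, E[S_hat^2] = 3 and a rate of 1 bit,
# which matches 0.5 * log2(Q / D).
distortion, var_hat, rate = quantizer_stats(4.0, 1.0)
```

The squared correlation coefficient reduces to $1 - D/Q$, which is why the rate collapses to $\frac{1}{2}\log(Q/D)$ as stated in (E-1).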

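In the beginning of block $i$, the relay has decoded correctly message $w_{r,i-1}$ and the index $\iota_{R,i-1}$ of the description $\hat{\mathbf{s}}_R[\iota_{R,i-1}]$ sent by the source in the previous block $i - 1$ (this will be justified below) and sends a Gaussian signal $\mathbf{x}_2[w_{r,i-1}]$ which carries message $w_{r,i-1}$ and is obtained via a DPC considering $\hat{\mathbf{s}}_R[\iota_{R,i-1}]$ as noncausal channel state information at the transmitter, as

\[ \mathbf{x}_2[w_{r,i-1}] = \frac{\sqrt{P_2}}{\rho_{12}\sqrt{\theta P_{1r}} + \sqrt{P_2}} \big( \mathbf{v}[i] - \alpha_2 \xi \hat{\mathbf{s}}_R[\iota_{R,i-1}] \big), \tag{E-2} \]

where the components of $\mathbf{v}[i]$ are generated i.i.d. using the auxiliary random variable $V$.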

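Let $\iota_{R,i}$ be the index associated with the state $\mathbf{s}[i + 1]$ of the next block $i + 1$. In the beginning of block $i$, the source sends a superposition of three Gaussian vectors,

\begin{align}
\mathbf{x}_1[i] &= \mathbf{x}_{sr}[\iota_{R,i}] + \mathbf{x}_{wr}[w_{r,i-1}, w_{r,i}] + \mathbf{x}_{wd}[w_{d,i}] \nonumber \\
\mathbf{x}_{wr}[w_{r,i-1}, w_{r,i}] &= \rho_{1s} \sqrt{\frac{\theta P_{1r}}{Q}}\, \mathbf{s}[i] + \rho_{12} \sqrt{\frac{\theta P_{1r}}{P_2}}\, \mathbf{x}_2[w_{r,i-1}] + \mathbf{x}'_{wr}[w_{r,i}]. \tag{E-3}
\end{align}

In (E-3), the vectors $\mathbf{x}_{sr}[\iota_{R,i}]$ and $\mathbf{x}_{wd}[w_{d,i}]$ are generated i.i.d. using the auxiliary random variables $X_{sr}$ and $X_{wd}$, respectively; and the vector $\mathbf{x}'_{wr}[w_{r,i}]$ has power $n(1 - \rho_{12}^2 - \rho_{1s}^2)\theta P_{1r}$ and is independent of $\mathbf{s}[i]$, $\mathbf{x}_2[w_{r,i-1}]$, $\mathbf{x}_{sr}[\iota_{R,i}]$ and $\mathbf{x}_{wd}[w_{d,i}]$. Furthermore, the vector $\mathbf{x}_{sr}[\iota_{R,i}]$ carries a description $\hat{\mathbf{s}}_R[\iota_{R,i}]$ of the state $\mathbf{s}[i + 1]$ that affects transmission in the next block $i + 1$, intended to be recovered only at the relay; the vector $\mathbf{x}_2[w_{r,i-1}]$ carries cooperative information $w_{r,i-1}$, and the vector $\mathbf{x}'_{wr}[w_{r,i}]$ carries new information $w_{r,i}$. The vectors $\mathbf{x}_{sr}[\iota_{R,i}]$, $\mathbf{x}_{wd}[w_{d,i}]$ and $\mathbf{x}'_{wr}[w_{r,i}]$ are obtained via DPCs considering $(\mathbf{s}[i], \hat{\mathbf{s}}_R[\iota_{R,i-1}])$ as noncausal channel state information at the transmitter, as

\begin{align}
\mathbf{x}_{sr}[\iota_{R,i}] &= \mathbf{u}_R[i] - \frac{\bar{\theta} P_{1r}}{\bar{\theta} P_{1r} + N_2 + P_{1d}} (1 - \alpha) \xi\, \mathbf{s}[i] \tag{E-4a} \\
\mathbf{x}_{wd}[w_{d,i}] &= \mathbf{u}_1[i] - \frac{P_{1d}}{P_{1d} + N_3 + \bar{\theta} P_{1r}} (1 - \alpha) \xi \big( \mathbf{s}[i] - \alpha_2 \hat{\mathbf{s}}_R[\iota_{R,i-1}] \big) \tag{E-4b} \\
\mathbf{x}'_{wr}[w_{r,i}] &= \mathbf{u}[i] - \alpha \xi \big( \mathbf{s}[i] - \alpha_2 \hat{\mathbf{s}}_R[\iota_{R,i-1}] \big) \tag{E-4c}
\end{align}

where the components of $\mathbf{u}_R[i]$, $\mathbf{u}_1[i]$ and $\mathbf{u}[i]$ are generated i.i.d. using the auxiliary random variables $U_R$, $U_1$ and $U$, respectively.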

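We now describe the decoding operations (we give simple arguments; the rigorous decoding uses joint typicality testing). Consider first the decoding at the relay. In block $i$, the relay receives

\[ \mathbf{y}_2[i] = \mathbf{x}_{sr}[\iota_{R,i}] + \rho_{12} \sqrt{\frac{\theta P_{1r}}{P_2}}\, \mathbf{x}_2[w_{r,i-1}] + \mathbf{x}'_{wr}[w_{r,i}] + \Big(1 + \rho_{1s} \sqrt{\frac{\theta P_{1r}}{Q}}\Big) \mathbf{s}[i] + \big( \mathbf{z}_2[i] + \mathbf{x}_{wd}[w_{d,i}] \big). \tag{E-5} \]

The relay knows $w_{r,i-1}$ and $\iota_{R,i-1}$ and decodes the pair $(w_{r,i}, \iota_{R,i})$ from $\mathbf{y}_2[i]$. The relay decodes $w_{r,i}$ and $\iota_{R,i}$ successively, starting with $w_{r,i}$. To decode $w_{r,i}$, the relay subtracts out the quantity $\big( \rho_{12} \sqrt{\theta P_{1r}/P_2}\, \mathbf{x}_2[w_{r,i-1}] + \alpha_2 \xi \hat{\mathbf{s}}_R[\iota_{R,i-1}] \big)$ from $\mathbf{y}_2[i]$ to make the channel equivalent to

\[ \mathbf{y}'_2[i] = \mathbf{x}'_{wr}[w_{r,i}] + \xi \big( \mathbf{s}[i] - \alpha_2 \hat{\mathbf{s}}_R[\iota_{R,i-1}] \big) + \big( \mathbf{z}_2[i] + \mathbf{x}_{sr}[\iota_{R,i}] + \mathbf{x}_{wd}[w_{d,i}] \big). \tag{E-6} \]

The relay decodes message $w_{r,i}$ from $\mathbf{y}'_2[i]$ treating signals $\mathbf{x}_{sr}[\iota_{R,i}]$ and $\mathbf{x}_{wd}[w_{d,i}]$ as unknown independent noises. This can be done reliably as long as $n$ is large and

\begin{align}
R_r &\le I(U; Y'_2) - I(U; \xi(S - \alpha_2 \hat{S}_R)) \nonumber \\
&= R\big( \alpha, (1 - \rho_{12}^2 - \rho_{1s}^2)\theta P_{1r}, \xi^2 \tilde{Q}, N_2 + \bar{\theta} P_{1r} + P_{1d} \big) \tag{E-7}
\end{align}

where the equality follows through straightforward algebra which we omit here for brevity (note that the variance of the additive state $\xi(S - \alpha_2 \hat{S}_R)$ in (E-6) is $\xi^2 E[(S - \alpha_2 \hat{S}_R)^2] = \xi^2 [(1 - \alpha_2)^2 Q - \alpha_2(\alpha_2 - 2)D] := \xi^2 \tilde{Q}$). Next, for the decoding of $\iota_{R,i}$, the relay subtracts out the quantity $\big( \mathbf{u}[i] - (1 - \alpha)\alpha_2 \xi \hat{\mathbf{s}}_R[\iota_{R,i-1}] \big)$ from $\mathbf{y}'_2[i]$ to make the channel equivalent to

\[ \tilde{\mathbf{y}}_2[i] = \mathbf{x}_{sr}[\iota_{R,i}] + (1 - \alpha) \xi\, \mathbf{s}[i] + \big( \mathbf{z}_2[i] + \mathbf{x}_{wd}[w_{d,i}] \big). \tag{E-8} \]

The relay decodes the index $\iota_{R,i}$ from $\tilde{\mathbf{y}}_2[i]$ correctly as long as $n$ is large and

\[ R_R \le I(U_R; \tilde{Y}_2) - I(U_R; S). \tag{E-9} \]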
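The variance identity quoted in the note after (E-7) can be confirmed from the test-channel moments $E[S\hat{S}_R] = Q - D$ and $E[\hat{S}_R^2] = Q - D$. The short sketch below compares the direct expansion of $E[(S - \alpha_2 \hat{S}_R)^2]$ with the quoted closed form; the function name and the numerical values used to exercise it are illustrative assumptions.

```python
def residual_state_variance(Q, D, alpha2):
    """Return E[(S - alpha2*S_hat)^2] computed two ways."""
    cov_s_shat = Q - D   # E[S * S_hat] = a*Q with a = 1 - D/Q
    var_shat = Q - D     # E[S_hat^2]
    # Direct expansion of the square.
    direct = Q - 2.0 * alpha2 * cov_s_shat + alpha2 ** 2 * var_shat
    # Closed form quoted in the text.
    quoted = (1.0 - alpha2) ** 2 * Q - alpha2 * (alpha2 - 2.0) * D
    return direct, quoted
```

Setting `alpha2 = 1` reduces both expressions to $D$, the quantization distortion, and `alpha2 = 0` reduces them to $Q$, the raw state power.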

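We now turn to the decoding at the destination at the end of block $i$. In block $i$, the destination receives

\begin{align}
\mathbf{y}_3[i] &= \mathbf{x}_1[i] + \mathbf{x}_2[w_{r,i-1}] + \mathbf{s}[i] + \mathbf{z}_3[i] \nonumber \\
&= \Big( \rho_{12} \sqrt{\frac{\theta P_{1r}}{P_2}} + 1 \Big) \mathbf{x}_2[w_{r,i-1}] + \mathbf{x}'_{wr}[w_{r,i}] + \mathbf{x}_{wd}[w_{d,i}] + \Big( \rho_{1s} \sqrt{\frac{\theta P_{1r}}{Q}} + 1 \Big) \mathbf{s}[i] + \big( \mathbf{z}_3[i] + \mathbf{x}_{sr}[\iota_{R,i}] \big). \tag{E-10}
\end{align}

At the end of block $i$, the destination knows message $w_{r,i-2}$ and decodes the pair $(w_{r,i-1}, w_{d,i-1})$ successively, treating the signal that carries the state description as unknown independent noise. It starts by decoding message $w_{r,i-1}$, using $(\mathbf{y}_3[i - 1], \mathbf{y}_3[i])$. Note that $w_{r,i-1}$ is carried by both auxiliary vectors $\mathbf{v}[i]$ and $\mathbf{u}[i - 1]$. If $n$ is large, it can do so reliably at rate

\begin{align}
R_r &\le I(V, U; Y_3) - I(V, U; S, \hat{S}_R) \nonumber \\
&= [I(V; Y_3) - I(V; \hat{S}_R)] + [I(U; Y_3|V) - I(U; S, \hat{S}_R|V)] \tag{E-11}
\end{align}

where the equality follows since the choice of $(V, \hat{S}_R)$ in (57) satisfies the Markov chain $V \leftrightarrow \hat{S}_R \leftrightarrow S$.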
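We first compute the term $[I(V; Y_3) - I(V; \hat{S}_R)]$. Let $\tilde{\mathbf{s}}[i]$ be the estimation error of $\xi \mathbf{s}[i]$ given $\hat{\mathbf{s}}_R[\iota_{R,i-1}]$ under the minimum mean square error criterion. Since $\mathbf{s}[i]$ and $\hat{\mathbf{s}}_R[\iota_{R,i-1}]$ are jointly Gaussian, $\tilde{\mathbf{s}}[i]$ is i.i.d. Gaussian with variance $E[(\xi S - \xi \hat{S}_R)^2] = \xi^2 D$ per element and is independent from $\hat{\mathbf{s}}_R[\iota_{R,i-1}]$. Thus, we can alternatively write the output $\mathbf{y}_3[i]$ as

\[ \mathbf{y}_3[i] = \Big( \rho_{12} \sqrt{\frac{\theta P_{1r}}{P_2}} + 1 \Big) \mathbf{x}_2[w_{r,i-1}] + \mathbf{x}'_{wr}[w_{r,i}] + \mathbf{x}_{wd}[w_{d,i}] + \xi \hat{\mathbf{s}}_R[\iota_{R,i-1}] + \big( \mathbf{z}_3[i] + \mathbf{x}_{sr}[\iota_{R,i}] + \tilde{\mathbf{s}}[i] \big). \tag{E-12} \]

With the choice of the auxiliary random variable $V$ as in (57) and that of the associated Costa's scale factor set to its optimal value as in (56), the destination decodes the vector $\mathbf{v}[i]$ correctly from $\mathbf{y}_3[i]$ at rate

\[ I(V; Y_3) - I(V; \hat{S}_R) = \frac{1}{2} \log\Big( 1 + \frac{(\rho_{12}\sqrt{\theta P_{1r}} + \sqrt{P_2})^2}{(1 - \rho_{12}^2 - \rho_{1s}^2)\theta P_{1r} + \bar{\theta} P_{1r} + P_{1d} + \xi^2 D + N_3} \Big) \tag{E-13} \]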

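where the equality follows through straightforward algebra. Let us now compute the term $[I(U; Y_3|V) - I(U; S, \hat{S}_R|V)]$. Observing that the destination can peel off $\mathbf{v}[i - 1]$ from $\mathbf{y}_3[i - 1]$ to make the channel equivalent to

\begin{align}
\mathbf{y}'_3[i - 1] &= \mathbf{y}_3[i - 1] - \Big( \Big( \rho_{12} \sqrt{\frac{\theta P_{1r}}{P_2}} + 1 \Big) \mathbf{x}_2[w_{r,i-2}] + \alpha_2 \xi \hat{\mathbf{s}}_R[\iota_{R,i-2}] \Big) \nonumber \\
&= \mathbf{x}'_{wr}[w_{r,i-1}] + \xi \big( \mathbf{s}[i - 1] - \alpha_2 \hat{\mathbf{s}}_R[\iota_{R,i-2}] \big) + \big( \mathbf{z}_3[i - 1] + \mathbf{x}_{sr}[\iota_{R,i-1}] + \mathbf{x}_{wd}[w_{d,i-1}] \big), \tag{E-14}
\end{align}

it is easy to see that, if $n$ is large and with the choice of the auxiliary random variable $U$ as in (52), the destination obtains the vector $\mathbf{u}[i - 1]$ correctly from $\mathbf{y}'_3[i - 1]$ at rate

\begin{align}
I(U; Y_3|V) - I(U; S, \hat{S}_R|V) &= I(U; Y'_3) - I(U; \xi(S - \alpha_2 \hat{S}_R)) \nonumber \\
&= R\big( \alpha, (1 - \rho_{12}^2 - \rho_{1s}^2)\theta P_{1r}, \xi^2 \tilde{Q}, N_3 + \bar{\theta} P_{1r} + P_{1d} \big) \tag{E-15}
\end{align}

where the last equality follows through straightforward algebra.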

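Finally, the destination can peel off $\mathbf{u}[i - 1]$ from $\mathbf{y}'_3[i - 1]$ to make the channel equivalent to

\begin{align}
\tilde{\mathbf{y}}_3[i - 1] &= \mathbf{y}'_3[i - 1] - \big( \mathbf{x}'_{wr}[w_{r,i-1}] + \alpha \xi ( \mathbf{s}[i - 1] - \alpha_2 \hat{\mathbf{s}}_R[\iota_{R,i-2}] ) \big) \nonumber \\
&= \mathbf{x}_{wd}[w_{d,i-1}] + \xi (1 - \alpha) \big( \mathbf{s}[i - 1] - \alpha_2 \hat{\mathbf{s}}_R[\iota_{R,i-2}] \big) + \big( \mathbf{z}_3[i - 1] + \mathbf{x}_{sr}[\iota_{R,i-1}] \big). \tag{E-16}
\end{align}

From (E-16), it is easy to see that if $n$ is large, and with the choice of the auxiliary random variable $U_1$ as in (57), the destination obtains the vector $\mathbf{u}_1[i - 1]$ (which carries message $w_{d,i-1}$) correctly at rate

\begin{align}
R_d &\le I(U_1; \tilde{Y}_3) - I(U_1; \xi(1 - \alpha)(S - \alpha_2 \hat{S}_R)) \nonumber \\
&= \frac{1}{2} \log\Big( 1 + \frac{P_{1d}}{N_3 + \bar{\theta} P_{1r}} \Big). \tag{E-17}
\end{align}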

Finally, for given $D$, adding (E-7) and (E-17), we obtain the first term of the minimization in (53); and adding (E-13), (E-15) and (E-17), we obtain the second term of the minimization in (53). Also, similarly to the proof of Theorem 5, observing that the rate terms in (53) decrease with $D$, we obtain the lower bound in Theorem 6 by taking the equality in (E-9) and maximizing the minimization in (53) over $P_{1r} \ge 0$, $P_{1d} \ge 0$ such that $0 \le P_{1r} + P_{1d} \le P_1$, $\theta \in [0, 1]$, $\rho_{12} \in [0, 1]$ and $\rho_{1s} \in [-1, 0]$ such that $0 \le \rho_{12}^2 + \rho_{1s}^2 \le 1$, and $\alpha \in \mathbb{R}$ such that the RHS of (E-7) is non-negative and the sum of the RHS of (E-15) and the RHS of (E-17) is non-negative. This completes the proof.

F. Proof of Theorem 7

In this section, we first use the upper bound for the DM case in Theorem 4 to obtain a new upper bound on the capacity of the state-dependent additive Gaussian model (32). Then, we show that this new upper bound is maximized by jointly Gaussian $(S, X_{1R}, X_{1D}, X_2, Z_2, Z_3)$.

From Theorem 4, we have that, given any $(\epsilon_n, n, R)$ sequence of codes with average error probability $P_e^n \to 0$ as $n \to +\infty$, the transmission rate $R$ satisfies

\[ R \le \min\{ I(X_{1R}; Y_2|X_2, S),\ I(X_2; Y_3) \} + I(X_{1D}; Y_3|X_2, S) \tag{F-1} \]

for some joint measure of the form

\[ P_{S,X_{1R},X_{1D},X_2,Y_2,Y_3} = Q_S P_{X_2} P_{X_{1R}|X_2} P_{X_{1D}|X_2,S} W_{Y_2|X_{1R},S} W_{Y_3|X_{1D},X_2,S}. \tag{F-2} \]

Since the channel structure (32) satisfies $W_{Y_2|X_{1R},X_2,S} = W_{Y_2|X_{1R},S}$, it follows that

\begin{align}
I(X_{1R}; Y_2|S, X_2) &= H(Y_2|S, X_2) - H(Y_2|S, X_2, X_{1R}) \nonumber \\
&= H(Y_2|S, X_2) - H(Y_2|S, X_{1R}) \nonumber \\
&\le H(Y_2|S) - H(Y_2|S, X_{1R}) \nonumber \\
&= I(X_{1R}; Y_2|S). \tag{F-3}
\end{align}

An upper bound on the capacity of the channel (32) is then given by

\[ R \le \min\{ I(X_{1R}; Y_2|S),\ I(X_2; Y_3) \} + I(X_{1D}; Y_3|X_2, S) \tag{F-4} \]

for some joint measure of the form

\[ P_{S,X_{1R},X_{1D},X_2,Y_2,Y_3} = Q_S P_{X_2} P_{X_{1R}} P_{X_{1D}|X_2,S} W_{Y_2|X_{1R},S} W_{Y_3|X_{1D},X_2,S}. \tag{F-5} \]



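(Note that, in contrast to Theorem 4 and (F-2), the inputs $X_{1R}$ and $X_2$ are independent in (F-5).)
Fix a joint distribution on $(S, X_{1R}, X_{1D}, X_2, Y_2, Y_3)$ of the form (F-5) satisfying

\begin{align}
& E[X_{1R}^2] = \tilde{P}_{1R} \le P_{1R}, \quad E[X_{1D}^2] = \tilde{P}_{1D} \le P_{1D}, \quad E[X_2^2] = \tilde{P}_2 \le P_2, \nonumber \\
& E[X_{1D} X_2] = \sigma_{12}, \quad E[X_{1D} S] = \sigma_{1s}. \tag{F-6}
\end{align}

We shall also use the correlation coefficients $\rho_{12} \in [-1, 1]$, $\rho_{1s} \in [-1, 1]$ defined as

\[ \rho_{12} = \frac{\sigma_{12}}{\sqrt{\tilde{P}_{1D} \tilde{P}_2}}, \qquad \rho_{1s} = \frac{\sigma_{1s}}{\sqrt{\tilde{P}_{1D} Q}}. \tag{F-7} \]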

We first compute the first term in the minimization on the RHS of (F-4). We have

\begin{align}
R &\le I(X_{1R}; Y_2|S) + I(X_{1D}; Y_3|X_2, S) \tag{F-8} \\
&= h(X_{1R} + Z_2|S) - h(Z_2) + h(X_{1D} + Z_3|X_2, S) - h(Z_3) \tag{F-9} \\
&\overset{(a)}{\le} h(X_{1R} + Z_2) - h(Z_2) + h(X_{1D} + Z_3|X_2, S) - h(Z_3) \tag{F-10} \\
&\overset{(b)}{\le} \frac{1}{2} \log\Big( 1 + \frac{\tilde{P}_{1R}}{N_2} \Big) + \frac{1}{2} \log\Big( 1 + \frac{\tilde{P}_{1D}(1 - \rho_{12}^2 - \rho_{1s}^2)}{N_3} \Big) \tag{F-11}
\end{align}

where: (a) holds since conditioning reduces entropy; and (b) holds since the differential entropy $h(X_{1R} + Z_2)$ is maximized if $(X_{1R}, Z_2)$ are jointly Gaussian, and the conditional differential entropy $h(X_{1D} + Z_3|X_2, S)$ is maximized if $(S, X_{1D}, X_2, Z_3)$ are jointly Gaussian.

We now compute the term $[I(X_2; Y_3) + I(X_{1D}; Y_3|X_2, S)]$. We have

\begin{align}
I(X_2; Y_3) + I(X_{1D}; Y_3|X_2, S) &\overset{(c)}{=} I(X_{1D}; Y_3|X_2, S) + I(X_2; Y_3) - I(X_2; S) \nonumber \\
&= I(X_{1D}; Y_3|X_2, S) + I(X_2; Y_3|S) - I(X_2; S|Y_3) \nonumber \\
&= h(Y_3|S) - h(Y_3|S, X_{1D}, X_2) - h(S|Y_3) + h(S|X_2, Y_3) \nonumber \\
&= h(Y_3) - h(S) + h(S|X_2, Y_3) - h(Z_3) \tag{F-12}
\end{align}

where: (c) follows since $X_2$ and $S$ are independent.
For fixed second moments (F-6), we have

\[ h(Y_3) \le \frac{1}{2} \log\big( 2\pi e (\tilde{P}_{1D} + \tilde{P}_2 + 2\sigma_{12} + 2\sigma_{1s} + Q + N_3) \big), \tag{F-13} \]

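where equality is attained if $Y_3$ is Gaussian. Similarly, the term $h(S|X_2, Y_3)$ is maximized if $(S, X_2, Y_3)$ are jointly Gaussian. Let $\hat{S}(X_2, Y_3) = E[S|X_2, Y_3]$ be the MMSE estimator of $S$ given $(X_2, Y_3)$, i.e.,

\begin{align}
\hat{S}(X_2, Y_3) &= E[S|X_2, X_{1D} + S + Z_3] \nonumber \\
&= \gamma_1 X_2 + \gamma_2 (X_{1D} + S + Z_3) \tag{F-14}
\end{align}

with

\begin{align}
\gamma_1 &= -\frac{\sigma_{12}(Q + \sigma_{1s})}{\tilde{P}_2(\tilde{P}_{1D} + 2\sigma_{1s} + Q + N_3) - \sigma_{12}^2} \nonumber \\
\gamma_2 &= \frac{\tilde{P}_2(Q + \sigma_{1s})}{\tilde{P}_2(\tilde{P}_{1D} + 2\sigma_{1s} + Q + N_3) - \sigma_{12}^2}. \tag{F-15}
\end{align}

Then, we have

\begin{align}
h(S|X_2, Y_3) &= h(S - \hat{S}(X_2, Y_3)|X_2, Y_3) \nonumber \\
&\le h(S - \gamma_1 X_2 - \gamma_2(X_{1D} + S + Z_3)) \nonumber \\
&= \frac{1}{2} \log\big( 2\pi e\, E[(S - \gamma_1 X_2 - \gamma_2(X_{1D} + S + Z_3))^2] \big) \nonumber \\
&= \frac{1}{2} \log\Big( 2\pi e\, \frac{Q \tilde{P}_{1D} \tilde{P}_2 + \tilde{P}_2 N_3 Q - \sigma_{1s}^2 \tilde{P}_2 - \sigma_{12}^2 Q}{\tilde{P}_2(\tilde{P}_{1D} + 2\sigma_{1s} + Q + N_3) - \sigma_{12}^2} \Big), \tag{F-16}
\end{align}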






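where the inequality is attained with equality if $S$, $X_{1D}$, $X_2$, $Y_3$ are jointly Gaussian. Then, from (F-12), (F-13) and (F-16) and straightforward algebra, we obtain

\begin{align}
I(X_2; Y_3) + I(X_{1D}; Y_3|S, X_2) &\le \frac{1}{2} \log\Big( 1 + \frac{\tilde{P}_{1D}(1 - \rho_{12}^2 - \rho_{1s}^2)}{N_3} \Big) \nonumber \\
&\quad + \frac{1}{2} \log\Big( 1 + \frac{(\sqrt{\tilde{P}_2} + \rho_{12}\sqrt{\tilde{P}_{1D}})^2}{\tilde{P}_{1D}(1 - \rho_{12}^2 - \rho_{1s}^2) + (\sqrt{Q} + \rho_{1s}\sqrt{\tilde{P}_{1D}})^2 + N_3} \Big). \tag{F-17}
\end{align}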
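The "straightforward algebra" leading from (F-12), (F-13) and (F-16) to (F-17) can be checked numerically. The sketch below builds the covariance matrix of $(S, X_2, Y_3)$ for $Y_3 = X_{1D} + X_2 + S + Z_3$ with $X_2$ independent of $S$, computes the MMSE of $S$ given $(X_2, Y_3)$ as a Schur complement, and compares it with the closed forms. The function name, the plain (untilded) parameter names and the sample values are illustrative assumptions; rates are in bits.

```python
import math
import numpy as np

def upper_bound_terms(P1D, P2, Q, N3, rho12, rho1s):
    s12 = rho12 * math.sqrt(P1D * P2)   # E[X1D X2]
    s1s = rho1s * math.sqrt(P1D * Q)    # E[X1D S]
    var_y3 = P1D + P2 + Q + N3 + 2 * s12 + 2 * s1s
    # Covariance of (S, X2, Y3); X2 is independent of S.
    cov = np.array([[Q,       0.0,      Q + s1s],
                    [0.0,     P2,       P2 + s12],
                    [Q + s1s, P2 + s12, var_y3]])
    c = cov[0, 1:]
    # MMSE of S given (X2, Y3): Schur complement of the (X2, Y3) block.
    mmse_numeric = float(Q - c @ np.linalg.solve(cov[1:, 1:], c))
    denom = P2 * (P1D + 2 * s1s + Q + N3) - s12 ** 2
    mmse_closed = (Q * P1D * P2 + P2 * N3 * Q - s1s ** 2 * P2 - s12 ** 2 * Q) / denom
    # h(Y3) - h(S) + h(S|X2,Y3) - h(Z3) for jointly Gaussian variables.
    via_entropies = 0.5 * math.log2(var_y3 * mmse_numeric / (Q * N3))
    core = P1D * (1 - rho12 ** 2 - rho1s ** 2)
    closed = 0.5 * math.log2(1 + core / N3) + 0.5 * math.log2(
        1 + (math.sqrt(P2) + rho12 * math.sqrt(P1D)) ** 2
        / (core + (math.sqrt(Q) + rho1s * math.sqrt(P1D)) ** 2 + N3))
    return mmse_numeric, mmse_closed, via_entropies, closed
```

For any admissible parameters (in particular $\rho_{12}^2 + \rho_{1s}^2 \le 1$), the first two outputs should coincide, as should the last two.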



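For convenience, let us now define the function $\Theta_1(\tilde{P}_{1R}, \tilde{P}_{1D}, \rho_{12}, \rho_{1s})$ as the RHS of (F-11) and the function $\Theta_2(\tilde{P}_{1D}, \tilde{P}_2, \rho_{12}, \rho_{1s})$ as the RHS of (F-17). From the above analysis, the capacity of the channel is upper-bounded as

\[ C \le \max \min\{ \Theta_1(\tilde{P}_{1R}, \tilde{P}_{1D}, \rho_{12}, \rho_{1s}),\ \Theta_2(\tilde{P}_{1D}, \tilde{P}_2, \rho_{12}, \rho_{1s}) \} \tag{F-18} \]

where the maximization is over all covariance matrices of $(X_{1R}, X_{1D}, X_2, S)$ of the form

\[ \begin{pmatrix}
\tilde{P}_{1R} & 0 & 0 & 0 \\
0 & \tilde{P}_{1D} & \rho_{12}\sqrt{\tilde{P}_{1D}\tilde{P}_2} & \rho_{1s}\sqrt{\tilde{P}_{1D}Q} \\
0 & \rho_{12}\sqrt{\tilde{P}_{1D}\tilde{P}_2} & \tilde{P}_2 & 0 \\
0 & \rho_{1s}\sqrt{\tilde{P}_{1D}Q} & 0 & Q
\end{pmatrix} \tag{F-19} \]

that satisfy

\begin{align}
\tilde{P}_{1R} &\le P_{1R}, \tag{F-20} \\
\tilde{P}_{1D} &\le P_{1D}, \tag{F-21} \\
\tilde{P}_2 &\le P_2 \tag{F-22}
\end{align}

and have non-negative discriminant, i.e., for $Q > 0$,

\[ Q \tilde{P}_{1R} \tilde{P}_{1D} \tilde{P}_2 (1 - \rho_{12}^2 - \rho_{1s}^2) \ge 0. \]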

Investigating $\Theta_1(\tilde{P}_{1R}, \tilde{P}_{1D}, \rho_{12}, \rho_{1s})$ and $\Theta_2(\tilde{P}_{1D}, \tilde{P}_2, \rho_{12}, \rho_{1s})$, it can be seen that it suffices to consider $\rho_{12} \in [0, 1]$ and $\rho_{1s} \in [-1, 0]$ for the maximization in (F-18).

Also, it is easy to see that, for fixed $\tilde{P}_{1D}$, the functions $\Theta_1(\tilde{P}_{1R}, \tilde{P}_{1D}, \rho_{12}, \rho_{1s})$ and $\Theta_2(\tilde{P}_{1D}, \tilde{P}_2, \rho_{12}, \rho_{1s})$ increase monotonically with $\tilde{P}_{1R}$ and $\tilde{P}_2$. So, for fixed $\tilde{P}_{1D}$, they are maximized at $\tilde{P}_{1R} = P_{1R}$ and $\tilde{P}_2 = P_2$. To complete the proof, we should show that $\Theta_1$ and $\Theta_2$ are also maximized at $\tilde{P}_{1D} = P_{1D}$.
It is clear that the function $\Theta_1(\tilde{P}_{1R}, \tilde{P}_{1D}, \rho_{12}, \rho_{1s})$ increases with $\tilde{P}_{1D}$. The term $\Theta_2(\tilde{P}_{1D}, \tilde{P}_2, \rho_{12}, \rho_{1s})$ can be seen as the sum rate of a two-user state-dependent MAC with state information known to one encoder, both encoders sending a common message and the informed encoder sending, in addition, an individual message [6]. As argued in [6], this sum rate increases with the power of the informed encoder [6, Appendix E], i.e., $\tilde{P}_{1D}$ here. This concludes the proof of Theorem 7.
G. Proof of Theorem 8

1) Converse Part: The converse part of Theorem 8 follows by computing the upper bound (F-4) in the proof of Theorem 7 for the special case (62), using the same jointly Gaussian distribution as in Appendix F; this gives the RHS of (63).

2) Achievability Part: Recall the lower bound in Corollary 1. With the choice $\hat{S}_R = \hat{S}_D = 0$, $U_R = U_D = 0$, $U = X_{1R}$ independent of $S$ and $V = X_2$ independent of $S$, we obtain

\[ R = \max \min\{ I(X_{1R}; Y_2|X_2),\ I(X_{1R}, X_2; Y_3) \} + [I(U_1; Y_3|X_{1R}, X_2) - I(U_1; S|X_{1R}, X_2)]^+ \tag{G-1} \]

where $[x]^+ := \max\{x, 0\}$ and the maximization is over all measures of the form

\[ P_{S,U_1,X_{1R},X_{1D},X_2,Y_2,Y_3} = Q_S P_{X_2} P_{X_{1R}|X_2} P_{U_1,X_{1D}|S,X_2} W_{Y_2|S,X_{1R}} W_{Y_3|X_{1D},X_2,S}. \tag{G-2} \]

In the proof of the direct part of Theorem 8, we compute the rate (G-1) using an appropriate jointly Gaussian distribution on $(S, U_1, X_{1R}, X_{1D}, X_2)$. The algebra in this section is similar to that in the proof of [12, Theorem 3] and [6, Theorem 6].

We first compute the term $[I(U_1; Y_3|X_{1R}, X_2) - I(U_1; S|X_{1R}, X_2)]$ in the RHS of (G-1) because this gives insights about the distribution that we should use to compute the lower bound. We assume that $X_{1R}$, $X_{1D}$ and $X_2$ are jointly Gaussian random variables with zero mean and variances $P_{1R}$, $P_{1D}$ and $P_2$, respectively. The random variables $X_{1R}$ and $X_2$ are independent and independent of the state $S$. The random variable $X_{1D}$ is independent of $X_{1R}$ and jointly Gaussian with $(S, X_2)$, with $E[X_{1D} X_2] = \rho_{12}\sqrt{P_{1D} P_2}$ and $E[X_{1D} S] = \rho_{1s}\sqrt{P_{1D} Q}$, for some correlation coefficients $\rho_{12} \in [-1, 1]$ and $\rho_{1s} \in [-1, 1]$.

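Let $\hat{X}_{1D} = E[X_{1D}|S, X_{1R}, X_2]$ be the optimal linear estimator of $X_{1D}$ given $(S, X_{1R}, X_2)$ under the minimum mean square error criterion, and $X'_{1D}$ be the resulting estimation error (note that $E[X_{1D}|S, X_{1R}, X_2] = E[X_{1D}|S, X_2]$). The estimator $\hat{X}_{1D}$ and the estimation error $X'_{1D}$ are given by

\begin{align}
\hat{X}_{1D} &= \rho_{12} \sqrt{\frac{P_{1D}}{P_2}}\, X_2 + \rho_{1s} \sqrt{\frac{P_{1D}}{Q}}\, S \tag{G-3} \\
X'_{1D} &= X_{1D} - \hat{X}_{1D}. \tag{G-4}
\end{align}

We can then write $Y_3$ alternatively as

\[ Y_3 = X'_{1D} + \Big( 1 + \rho_{12} \sqrt{\frac{P_{1D}}{P_2}} \Big) X_2 + \Big( 1 + \rho_{1s} \sqrt{\frac{P_{1D}}{Q}} \Big) S + Z_3. \tag{G-5} \]

Let now

\[ Y'_3 := Y_3 - E[Y_3|X_{1R}, X_2] = X'_{1D} + \Big( 1 + \rho_{1s} \sqrt{\frac{P_{1D}}{Q}} \Big) S + Z_3. \tag{G-6} \]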

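Noticing now that $X'_{1D}$ is independent of the state $S$ in (G-6), it is clear that an optimal choice of the associated auxiliary random variable $U_1$ is

\[ U_1 = X'_{1D} + \alpha \Big( 1 + \rho_{1s} \sqrt{\frac{P_{1D}}{Q}} \Big) S \tag{G-7} \]

where $\alpha$ is Costa's parameter given by

\[ \alpha = \frac{E[X'^2_{1D}]}{E[X'^2_{1D}] + E[Z_3^2]} = \frac{P_{1D}(1 - \rho_{12}^2 - \rho_{1s}^2)}{P_{1D}(1 - \rho_{12}^2 - \rho_{1s}^2) + N_3}. \tag{G-8} \]

Then we can easily show that

\[ I(U_1; Y_3|X_{1R}, X_2) - I(U_1; S|X_{1R}, X_2) = I(U_1; Y'_3) - I(U_1; S). \tag{G-9} \]

By substituting $X'_{1D}$ in (G-7), we get

\[ U_1 = X_{1D} - \rho_{12} \sqrt{\frac{P_{1D}}{P_2}}\, X_2 + \alpha_{\mathrm{opt}} S \tag{G-10} \]

with

\[ \alpha_{\mathrm{opt}} = \frac{P_{1D}(1 - \rho_{12}^2 - \rho_{1s}^2)}{P_{1D}(1 - \rho_{12}^2 - \rho_{1s}^2) + N_3} \Big( 1 + \rho_{1s} \sqrt{\frac{P_{1D}}{Q}} \Big) - \rho_{1s} \sqrt{\frac{P_{1D}}{Q}}. \tag{G-11} \]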

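Now, it is easy to see that, with the choice (G-10), we have

\begin{align}
I(U_1; Y_3|X_{1R}, X_2) - I(U_1; S|X_{1R}, X_2) &= I(U_1; Y'_3) - I(U_1; S) \nonumber \\
&= \frac{1}{2} \log\Big( 1 + \frac{E[X'^2_{1D}]}{E[Z_3^2]} \Big) \nonumber \\
&= \frac{1}{2} \log\Big( 1 + \frac{P_{1D}(1 - \rho_{12}^2 - \rho_{1s}^2)}{N_3} \Big). \tag{G-12}
\end{align}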
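The rate in (G-12) is the standard dirty-paper result: with the scale factor of (G-8), the state term in (G-6) costs nothing. The sketch below evaluates $I(U_1; Y'_3) - I(U_1; S)$ for a scalar Gaussian channel $Y' = X' + S' + Z$ with $U = X' + \alpha S'$, where $S'$ stands for the scaled state term of (G-6). The function name and the numerical values are illustrative assumptions; rates are in bits.

```python
import math

def dpc_rate(P, N, Qs, alpha):
    """I(U;Y') - I(U;S') for U = X' + alpha*S', Y' = X' + S' + Z.

    P  : power of X' (independent of S' and Z)
    N  : noise variance
    Qs : variance of the (scaled) state term S'
    """
    var_u = P + alpha ** 2 * Qs
    var_y = P + Qs + N
    cov_uy = P + alpha * Qs
    # Mutual information between two jointly Gaussian scalars.
    i_uy = 0.5 * math.log2(var_u * var_y / (var_u * var_y - cov_uy ** 2))
    # Given S', the only randomness left in U is X'.
    i_us = 0.5 * math.log2(var_u / P)
    return i_uy - i_us

P, N, Qs = 2.0, 1.0, 3.0
alpha_costa = P / (P + N)                      # same form as (G-8)
best = dpc_rate(P, N, Qs, alpha_costa)         # equals 0.5*log2(1 + P/N)
naive = dpc_rate(P, N, Qs, 0.0)                # state treated as noise
```

With `alpha = P / (P + N)` the rate equals $\frac{1}{2}\log(1 + P/N)$ regardless of `Qs`, whereas `alpha = 0` gives the smaller rate $\frac{1}{2}\log(1 + P/(Q_s + N))$ obtained when the state is treated as noise.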



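We now compute the terms $I(X_{1R}; Y_2|X_2)$ and $I(X_2; Y_3)$. It is easy to see that, with the aforementioned jointly Gaussian input distribution,

\begin{align}
I(X_{1R}; Y_2|X_2) &= I(X_{1R}; Y_2) \nonumber \\
&= \frac{1}{2} \log\Big( 1 + \frac{P_{1R}}{N_2} \Big). \tag{G-13}
\end{align}

Also, we have

\begin{align}
I(X_{1R}, X_2; Y_3) &\overset{(a)}{=} I(X_2; Y_3) \nonumber \\
&= h(Y_3) - h(Y_3|X_2) \nonumber \\
&= h(Y_3) - h(X'_{1D} + E[X_{1D}|X_2] + E[X_{1D}|S] + S + Z_3|X_2) \nonumber \\
&\overset{(b)}{=} h(Y_3) - h(X'_{1D} + E[X_{1D}|S] + S + Z_3) \nonumber \\
&= \frac{1}{2} \log\Big( \frac{E[(X_{1D} + X_2 + S)^2] + E[Z_3^2]}{E[X'^2_{1D}] + E[(S + E[X_{1D}|S])^2] + E[Z_3^2]} \Big) \nonumber \\
&\overset{(c)}{=} \frac{1}{2} \log\Big( 1 + \frac{(\sqrt{P_2} + \rho_{12}\sqrt{P_{1D}})^2}{P_{1D}(1 - \rho_{12}^2 - \rho_{1s}^2) + (\sqrt{Q} + \rho_{1s}\sqrt{P_{1D}})^2 + N_3} \Big) \tag{G-14}
\end{align}

where: (a) holds since $X_{1R}$ is independent of $(X_2, Y_3)$; (b) holds since $X'_{1D}$ and $S$ are independent of $X_2$; and (c) follows through straightforward algebra.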

Adding (G-12) and (G-13), we obtain the first term of the minimization in (63); and adding (G-12) and (G-14), we obtain the second term of the minimization in (63).

Finally, we obtain the capacity in Theorem 8 by maximizing the RHS of (63) over all possible values of $\rho_{12} \in [-1, 1]$ and $\rho_{1s} \in [-1, 1]$. Investigating the two terms of the minimization, we can easily see that it suffices to consider $\rho_{12} \in [0, 1]$ and $\rho_{1s} \in [-1, 0]$. This concludes the proof of Theorem 8.